With BeatriceVec, users can transform textual data into meaningful vector representations. These embeddings can capture semantic relationships between words, enabling algorithms and models to understand context and similarities between different words. This capability proves particularly useful in tasks such as sentiment analysis, language translation, and recommendation systems.
Create your embeddings with BeatriceVec and query your model locally, with no internet connection required.
BeatriceVec uses an embedding dimensionality of 600, providing a rich representation space that can capture nuanced semantic information. A higher dimensionality lets the embeddings potentially encode more complex relationships and finer-grained distinctions between words, which can improve performance in downstream NLP tasks.
The package offers a user-friendly interface and a straightforward API, making it accessible to both beginners and experienced practitioners. It provides functions to train custom word embeddings on user-specific text corpora, allowing users to fine-tune embeddings according to their specific domain or application requirements.
It empowers developers and researchers to explore word embeddings and leverage distributed word representations in their NLP projects. Its self-contained implementation, high-dimensional embeddings, and ease of use make it a valuable tool for tasks such as text analysis, information retrieval, and language understanding.
Overall, BeatriceVec is a reliable and efficient Python package for generating word embeddings, offering flexibility, performance, and ease of use to enhance various NLP applications and empower developers in the field of natural language processing.
Install the package or the wheel; both are found in the dist folder.
# Wheel
pip install beatricevec-1.0.1-py3-none-any.whl

# Package (source distribution)
pip install beatricevec-1.0.1.tar.gz
Download the wheel or package here
from beatricevec import BeatriceVec

# A small example corpus: one string per document.
corpus = ["I am learning", "Natural language processing", "with BeatriceVec"]

embedder = BeatriceVec(corpus)
embedder.build_vocab()              # build the vocabulary from the corpus
embedder.initialize_word_vectors()  # initialize word vectors with random values
embedder.train()                    # train using the Word2Vec algorithm

# Print the embedding vector for every word in the vocabulary.
embeddings = embedder.get_embeddings()
for embedding in embeddings:
    print(embedding)
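Individual vectors can also be looked up by word and compared directly. The snippet below is a minimal sketch that continues the quickstart above; the cosine_similarity helper and the chosen words are illustrative, not part of the BeatriceVec API:

import math

def cosine_similarity(a: list, b: list) -> float:
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Look up single words from the corpus (get_embedding is documented below).
v1 = embedder.get_embedding("language")
v2 = embedder.get_embedding("learning")

print(len(v1))                    # 600, the embedding dimensionality
print(cosine_similarity(v1, v2))  # similarity score in [-1, 1]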
build_vocab(): Builds the vocabulary from the corpus.
initialize_word_vectors(): Initializes the word vectors with random values.
train(): Trains the embedding model using the Word2Vec algorithm.
update_vector(vector: list, context_vector: list): Updates the target word vector using gradient descent (see the sketch after this list).
get_embeddings() -> list: Retrieves the embeddings for all words in the vocabulary.
get_embedding(word: str) -> list: Retrieves the embedding vector for a given word.
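The package does not spell out what update_vector computes internally, but a typical Word2Vec-style gradient-descent step looks like the following minimal sketch. Everything here (the logistic loss, the label, and the learning rate lr) is an illustrative assumption, not BeatriceVec's actual implementation:

import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def update_vector_sketch(vector: list, context_vector: list,
                         label: float = 1.0, lr: float = 0.025) -> list:
    # Score the (target, context) pair with a dot product.
    score = sum(v * c for v, c in zip(vector, context_vector))
    # Gradient of the logistic loss: predicted probability minus true label.
    grad = sigmoid(score) - label
    # One gradient-descent step on the target word vector.
    return [v - lr * grad * c for v, c in zip(vector, context_vector)]

With label 1.0 for observed target-context pairs, repeated updates of this form pull the vectors of co-occurring words together, which is how the semantic similarities described above emerge.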
BeatriceVec is released under the Apache 2.0 License.
How to CONTRIBUTE