Add a Bitdeli Badge to README #178

Merged
92 commits merged on Jul 7, 2015
92 commits
bdd2ae5
Basic spellchecker worker ready
fccoelho Feb 22, 2013
6c7657b
Tests passing on spellchecker worker
fccoelho Feb 22, 2013
3a16e27
Added pre-instancing of checkers, as per suggestion of @turicas.
fccoelho Feb 26, 2013
a0a23fb
Added handling of unsupported language by returning None instead of l…
fccoelho Feb 27, 2013
33ff882
added a test for english
fccoelho Feb 27, 2013
c5aac9e
Removes workaround for nltk not returning unicode stopwords
flavioamieiro Nov 5, 2014
6c5b022
Merge branch 'bugfix/remove_workaround_for_nltk_issue' into develop
flavioamieiro Nov 5, 2014
d4c6873
Fixes `UnicodeDecodeError` in `PalavrasRaw` worker
flavioamieiro Nov 11, 2014
f196c62
Merge branch 'bugfix/fix_unicodedecodeerror_in_palavras_worker' into …
flavioamieiro Nov 11, 2014
43048ac
Makes sure we use the correct codec in all Palavras workers
flavioamieiro Nov 11, 2014
48360d8
Fixes even more `Unicode{En,De}codeError`s
flavioamieiro Nov 11, 2014
63e430c
Adds a property to tell if PalavrasRaw was run for this document
flavioamieiro Jan 16, 2015
ab61046
Makes sure workers that depend directly on palavras don't run if pala…
flavioamieiro Jan 16, 2015
5680345
Merge branch 'feature/fix_palavras_exceptions' into develop
flavioamieiro Jan 16, 2015
c7fe275
Merge pull request #146 from fccoelho/feature/spellchecker
fccoelho Mar 22, 2015
35908ca
Fixes tokenizer test
flavioamieiro Apr 14, 2015
cf36ac2
Fixes extractor test
flavioamieiro Apr 14, 2015
3eef9df
Adds pyenchant to requirements
flavioamieiro Apr 14, 2015
641957f
Updates palavras related tests
flavioamieiro Apr 14, 2015
d049be3
Merge branch 'fix_tests' into develop
flavioamieiro Apr 14, 2015
1d18679
WIP - First draft of worker using a celery task
flavioamieiro Apr 14, 2015
6b5706d
Removes unused code from the tokenizer (and its test)
flavioamieiro Apr 15, 2015
acb90f1
Uses `task.delay().get()` instead of `task.apply()`
flavioamieiro Apr 15, 2015
2eeeff9
Adapts freqdist worker to use celery
flavioamieiro Apr 15, 2015
a9c3ab4
Adds first draft of a mongodict subclass to represent documents
flavioamieiro Apr 15, 2015
f7ca6dc
adds the file that defines the celery app (this was missing from prev…
flavioamieiro Apr 15, 2015
3184f65
Renames `MongoDictById` to `MongoDictAdapter`
flavioamieiro Apr 16, 2015
9e3d73a
Finishes `MongoDictAdapter`
flavioamieiro Apr 16, 2015
3244460
Improve tests for freqdist worker
flavioamieiro Apr 16, 2015
ec8ba0d
Creates a base class for all our workers
flavioamieiro Apr 16, 2015
d09f03f
Creates base test class for pypln tasks
flavioamieiro Apr 16, 2015
e8c511e
Uses fake_id consistently in freqdist test
flavioamieiro Apr 16, 2015
58112f7
Renames `tokenizer` -> `Tokenizer`
flavioamieiro Apr 16, 2015
0ef0079
Adds note about the import that is holding the app together
flavioamieiro Apr 16, 2015
aefb3aa
Migrates the tokenizer test to the new class based approach
flavioamieiro Apr 16, 2015
72c9e22
Migrates wordcloud worker to Celery
flavioamieiro Apr 18, 2015
8b15114
Migrates the Statistics worker to a Celery task
flavioamieiro Apr 19, 2015
6aefdaa
Migrates Bigrams worker to a Celery Task
flavioamieiro Apr 19, 2015
6b41dbc
Migrates `PalavrasRaw` worker to a Celery task
flavioamieiro Apr 20, 2015
97af711
Migrates palavras NounPhrase worker to a Celery task
flavioamieiro Apr 21, 2015
5fe3bf1
Migrates palavras SemmanticTagger worker to a Celery task
flavioamieiro Apr 21, 2015
9c370bf
Migrates POS worker to a Celery task
flavioamieiro Apr 21, 2015
d1a8b71
Adds test to check if POS worker routes portuguese documents to the p…
flavioamieiro Apr 21, 2015
8f3d0b7
Migrates Trigram worker to Celery task
flavioamieiro Apr 22, 2015
d23201c
Removes unnecessary import in `test_worker_wordcloud.py`
flavioamieiro Apr 22, 2015
e25eed3
Migrates Spellchecker worker to a Celery Task
flavioamieiro Apr 22, 2015
0dbdfb4
Migrates Lemmatizer worker to a Celery task
flavioamieiro Apr 22, 2015
51648d6
commented out pyrex from requirements/development.txt
fccoelho Apr 22, 2015
a028a73
Renames Celery app to 'pypln_workers'
flavioamieiro Apr 23, 2015
bd119d6
WIP: starts to change Extractor worker into a Celery task
flavioamieiro Apr 23, 2015
2f6fe45
Fixes and documents the issue with the app import
flavioamieiro Apr 23, 2015
f6173dd
Changes all the Extractor tests to use it as a Celery task
flavioamieiro Apr 23, 2015
f790d98
Adds Task to retrieve filedata from GridFS
flavioamieiro Apr 24, 2015
89a995b
Removes pypelinin structure
flavioamieiro Apr 24, 2015
01aed6a
Adds copyright notice in the files that didn't have it
flavioamieiro Apr 24, 2015
d73a821
Adds a config module
flavioamieiro Apr 28, 2015
807f115
Moves GridFS config to config module
flavioamieiro Apr 28, 2015
a26aad7
Substitutes Make target `run` by `run-celery`
flavioamieiro Apr 29, 2015
3c1a4ab
Uses a dictionary with mongodb configuration
flavioamieiro Apr 29, 2015
b4b15f2
Makes sure tests only run if the database name starts with `test`
flavioamieiro Apr 29, 2015
1ffd33d
Adds the possibility of having a local configuration module
flavioamieiro Apr 29, 2015
1ef7252
Fixes 'tests' and 'tests-x' make targets
flavioamieiro Apr 29, 2015
b0a7e43
Updates README to reflect the changes in the project
flavioamieiro Apr 29, 2015
e2f3ec3
Adds GridFSDataRetriever to the exported attributes of pypln.backend.…
flavioamieiro Apr 30, 2015
3d06fb3
Adds script to run celery in production
flavioamieiro May 7, 2015
25ab920
Makes run_celery.sh script executable
flavioamieiro May 7, 2015
e5fee58
Makes sure `GridFSDataRetriever` connects to the correct mongo database
flavioamieiro May 11, 2015
5ed6e42
Makes sure we use the correct hostname and port when using MongoDictA…
flavioamieiro May 12, 2015
217903d
Merge branch 'feature/celery' into develop
flavioamieiro May 18, 2015
02dd767
Gets pypln storage configuration from config file if available
flavioamieiro May 18, 2015
fc9013a
Adds a small section to `README.rst` about creating new workers
flavioamieiro May 19, 2015
44454fe
Update README.rst
fccoelho May 20, 2015
b609af4
Implements a worker to index documents in an elasticsearch server. Bu…
fccoelho May 20, 2015
6b032da
Added test for elastic_indexer
fccoelho May 20, 2015
6d973c6
Pins the pymongo version for now
flavioamieiro May 20, 2015
129c28c
Adds configuration for the result backend and the message broker
flavioamieiro May 20, 2015
8e3324f
Adds celery username and password to configuration
flavioamieiro May 20, 2015
e1475a2
Adds index_name as a parameter to the indexing call
flavioamieiro May 25, 2015
9a7602f
Ignores error when trying to delete an index that still doesn't exist …
flavioamieiro May 25, 2015
c8459c0
Fixes typo in the Indexer test name and removes trailing whitespace
flavioamieiro May 25, 2015
b91523d
Changes test index name
flavioamieiro May 25, 2015
1c693f2
Adds `ElasticIndexer` to the list of exported workers
flavioamieiro Jun 16, 2015
2aaa10f
Uses the file_id generated by gridfs instead of id generated by postgres
flavioamieiro Jun 16, 2015
e2d7748
Fixes ElasticIndexer test
flavioamieiro Jun 19, 2015
21debe8
Removes unnecessary trailing lines in `elastic_indexer.py`
flavioamieiro Jun 19, 2015
983fcb4
Merge pull request #174 from flavioamieiro/feature/elastic-indexer
fccoelho Jun 22, 2015
e4d0cb8
Adds a worker that deletes a file from GridFS
flavioamieiro Jun 22, 2015
9ae6f25
Merge pull request #175 from flavioamieiro/feature/delete-file-worker
fccoelho Jun 22, 2015
8ab3b3e
Removes unused variable declaration
flavioamieiro Jun 22, 2015
579e0bc
Fixes ElasticIndexer for binary files
flavioamieiro Jun 26, 2015
cba555a
Merge pull request #177 from flavioamieiro/bugfix/indexing_contents
fccoelho Jun 26, 2015
97ffeb1
Add a Bitdeli badge to README
bitdeli-chef Jul 7, 2015
Added test for elastic_indexer
fccoelho committed May 20, 2015
commit 6b032da7b1342fdd2ae5264ac96c78adb13e2820
2 changes: 1 addition & 1 deletion pypln/backend/config.py
@@ -19,7 +19,7 @@ def get_store_config():

 MONGODB_CONFIG = get_store_config()
 ELASTICSEARCH_CONFIG = {
-    'hosts': ['172.16.4.46', '172.16.4.52'],
+    'hosts': ['127.0.0.1', '172.16.4.46', '172.16.4.52'],
 }

 try:
34 changes: 34 additions & 0 deletions tests/test_elastic_indexer.py
@@ -0,0 +1,34 @@
#-*- coding:utf-8 -*-
u"""
Created on 20/05/15
by fccoelho
license: GPL V3 or Later
"""

__docformat__ = 'restructuredtext en'


from pypln.backend.workers.elastic_indexer import ElasticIndexer
from .utils import TaskTest
from elasticsearch import Elasticsearch


class TestIndexa(TaskTest):
    def test_indexing_go_through(self):
        ES = Elasticsearch()
        ES.indices.delete('test')
        ES.indices.create('test')
        doc = {
            'index_name': "test",
            'doc_type': 'document',
            'pypln_id': 1,
            'text': "Om nama Shivaya "*100
        }

        self.document.update(doc)
        ElasticIndexer().delay(self.fake_id)
        assert self.document['created']  # must be True
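
Note on the test setup: as committed here, the test deletes the `test` index before creating it, so a first run against a server with no such index would probably fail at the delete call. A later commit in this PR (9a7602f) addresses a delete on a missing index. Below is a minimal sketch of one way to make that setup step tolerant. It is not the code from that commit. `reset_index`, `FakeIndices` and `IndexMissing` are hypothetical names invented for illustration, and the fake stands in for a real client so the sketch runs without a server.

```python
class IndexMissing(Exception):
    """Hypothetical stand-in for the client's 'index not found' error."""


class FakeIndices:
    """Hypothetical in-memory stand-in for ``Elasticsearch().indices``."""

    def __init__(self):
        self.names = set()

    def create(self, index):
        self.names.add(index)

    def delete(self, index):
        if index not in self.names:
            raise IndexMissing(index)
        self.names.remove(index)


def reset_index(indices, name, missing_error=IndexMissing):
    """Delete ``name`` if it exists, then create it empty."""
    try:
        indices.delete(index=name)
    except missing_error:
        pass  # nothing to delete on a first run
    indices.create(index=name)


indices = FakeIndices()
reset_index(indices, "test")  # index did not exist yet
reset_index(indices, "test")  # index existed and is recreated
```

With the real client, the same helper could presumably be called with `ES.indices` and the client's own not-found exception passed as `missing_error`. That exception's name and import path depend on the installed `elasticsearch` version, so treat them as assumptions to check.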