Elasticbox is a simple, bare-bones app for testing Apache Tika and Elasticsearch features.
- JDK 8
- Git: follow the installation instructions on the Git site.
- Maven: follow the installation instructions on the Maven site.
Make sure all executables (such as java, mvn and git) are available on your PATH environment variable.
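One quick way to confirm the PATH requirement is a small shell check. This is only a sketch for a POSIX shell; the helper name `missing_tools` is made up for illustration and is not part of Elasticbox.

```shell
# Hypothetical helper: print every required tool that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Prints nothing when java, mvn and git are all reachable.
missing_tools java mvn git
```

Any name printed by the last line still needs to be installed or added to PATH before building.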
- Main
- Tools
- Others
git clone https://github.com/fvalmeida/elasticbox.git
cd elasticbox
mvn package
- After the build, copy the contents of the target directory to your desired installation path.
- From the elasticsearch directory, start Elasticsearch: run bin/elasticsearch on Unix or bin/elasticsearch.bat on Windows.
There are many ways to index files:
- Copy elasticbox-tika-indexer.jar into the directory you want to index and run it from there:
java -jar elasticbox-tika-indexer.jar
- From the installation path, run elasticbox-tika-indexer.jar with the --paths argument:
java -jar elasticbox-tika-indexer.jar --paths=/Users/fvalmeida/Documents
Usage:
java -jar elasticbox-tika-indexer.jar <options>

  -?, -h, --help
        Show the help
  --index.name=<value>
        Elasticsearch index name (default: "elasticbox")
  --filter=<syntax:pattern>
        A filter that may be used to match paths against the pattern
        Supports the "glob" and "regex" syntaxes
  --paths=<comma-separated paths>
        Paths for index to Elasticsearch (default: "current directory")
  --recursive=<true|false>
        Index path recursively (default: true)
  --spring.data.elasticsearch.cluster-nodes=<comma-separated nodes>
        Elasticsearch cluster nodes (default: "localhost:9300")
  --thread-count=<number of threads>
        Max number of threads (default: 10)
  --error.logging.file=<value>
        Error logging file (default: "elasticbox.error.log")

Examples:
  java -jar elasticbox-tika-indexer.jar
  java -jar elasticbox-tika-indexer.jar --recursive=false
  java -jar elasticbox-tika-indexer.jar --paths=/Documents --index.name=documents
  java -jar elasticbox-tika-indexer.jar --filter=glob:*.{pdf,doc}
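A note on the --filter example: in shells that perform brace expansion (bash and zsh, for instance), an unquoted `*.{pdf,doc}` is rewritten by the shell before the program starts, so the indexer may receive two separate --filter arguments instead of one pattern. Quoting the argument avoids this. The sketch below uses a made-up helper, `show_args`, only to display what a program would receive; it is not part of Elasticbox.

```shell
# Hypothetical helper: print each received argument on its own line.
show_args() { printf '%s\n' "$@"; }

# Single quotes keep the pattern intact, so exactly one argument is passed.
show_args '--filter=glob:*.{pdf,doc}'
```

Under that assumption, the safer form of the last example would be java -jar elasticbox-tika-indexer.jar '--filter=glob:*.{pdf,doc}'.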