fvalmeida/elasticbox

Elasticbox

Elasticbox is a simple, bare-bones app for testing Apache Tika and Elasticsearch features.

Requisites

  • JDK 8
  • Git: follow the installation instructions on the official site.
  • Maven: follow the installation instructions on the official site.

Make sure all executables (such as java, mvn and git) are on your PATH environment variable.
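
A quick way to confirm this is to ask the shell where each tool resolves. The following is a minimal POSIX shell sketch; the function name `missing_tools` is hypothetical and not part of Elasticbox.

```shell
#!/bin/sh
# missing_tools: hypothetical helper, not part of Elasticbox.
# Prints the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the three tools this README requires.
missing=$(missing_tools java mvn git)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "java, mvn and git are all on PATH"
fi
```

If a tool is reported as missing, add its bin directory to PATH and open a new terminal before building.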

Technology stack

Installation

git clone https://github.com/fvalmeida/elasticbox.git
cd elasticbox
mvn package

Running

  1. After the build, copy the contents of the target directory to your desired installation path

  2. Start Elasticsearch from its directory

    Run bin/elasticsearch on Unix or bin/elasticsearch.bat on Windows

  3. There are several ways to index files:

    • Copy elasticbox-tika-indexer.jar to the directory you want to index and run it there

      java -jar elasticbox-tika-indexer.jar
      
    • From the installation path, run elasticbox-tika-indexer.jar with the --paths argument

      java -jar elasticbox-tika-indexer.jar --paths=/Users/fvalmeida/Documents
      

    Usage:

    java -jar elasticbox-tika-indexer.jar <options>
    
       -?, -h, --help
          Show the help
    
       --index.name=<value>
          Elasticsearch index name (default: "elasticbox")
    
       --filter=<syntax:pattern>
          A filter that may be used to match paths against the pattern
          Supports the "glob" and "regex" syntaxes
    
       --paths=<comma-separated paths>
          Paths for index to Elasticsearch (default: "current directory")
    
       --recursive=<true|false>
          Index path recursively (default: true)
    
       --spring.data.elasticsearch.cluster-nodes=<comma-separated nodes>
          Elasticsearch cluster nodes (default: "localhost:9300")
    
       --thread-count=<number of threads>
          Max number of threads (default: 10)
    
       --error.logging.file=<value>
          Error logging file (default: "elasticbox.error.log")
    
    Examples:
       java -jar elasticbox-tika-indexer.jar
       java -jar elasticbox-tika-indexer.jar --recursive=false
       java -jar elasticbox-tika-indexer.jar --paths=/Documents --index.name=documents
       java -jar elasticbox-tika-indexer.jar --filter=glob:*.{pdf,doc}
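
One caveat when typing the last example into an interactive shell: bash and zsh perform brace expansion on an unquoted `{pdf,doc}`, so the indexer may receive two separate `--filter` arguments instead of a single pattern. Below is a minimal sketch of the safer, quoted form. `count_args` is a hypothetical helper used only to show how many arguments a program would see, and `/Documents` and `/Downloads` are placeholder paths.

```shell
#!/bin/sh
# count_args: hypothetical helper, prints how many arguments it received.
count_args() {
    echo "$#"
}

# Single quotes keep the whole filter as one argument in any POSIX shell.
count_args '--filter=glob:*.{pdf,doc}'

# The same quoting applied to an indexer invocation (placeholder paths):
# java -jar elasticbox-tika-indexer.jar --paths=/Documents,/Downloads '--filter=glob:*.{pdf,doc}'
```

Quoting the argument passes the pattern to the indexer unchanged, so the braces are interpreted by the "glob" syntax rather than by the shell.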
    
