Releases: HHN/crawler4j

v5.1.0

22 Oct 08:08

What's new?

This release introduces several key improvements and dependency upgrades, ensuring enhanced performance and compatibility. A major change is the switch from the heavier Apache Tika Standard Parser package to the lightweight HTML module. If your project requires parsing binary content, you will now need to manually add the Apache Tika Standard Parser dependency as follows:

<dependency>
   <groupId>org.apache.tika</groupId>
   <artifactId>tika-parsers-standard</artifactId>
   <version>3.0.0</version>
</dependency>

Make sure to add this alongside your existing Crawler4j dependencies to maintain binary content parsing capabilities.

Please also note that crawler4j now uses `slf4j2` (SLF4J 2.x) as its logging facade.

Auto Generated Changelog

  • Bump flyway-core from 9.18.0 to 9.19.4 by @dependabot in #223
  • Bump jackson-core from 2.15.1 to 2.15.2 by @dependabot in #220
  • Bump license-maven-plugin from 2.0.1 to 2.1.0 by @dependabot in #222
  • Bump hsqldb from 2.7.1 to 2.7.2 by @dependabot in #219
  • Bump maven-surefire-plugin from 3.1.0 to 3.1.2 by @dependabot in #221
  • Bump commons-io from 2.12.0 to 2.13.0 by @dependabot in #225
  • Bump versions-maven-plugin from 2.15.0 to 2.16.0 by @dependabot in #224
  • Bump httpcore5 from 5.2.1 to 5.2.2 by @dependabot in #226
  • Bump httpcore5-h2 from 5.2.1 to 5.2.2 by @dependabot in #228
  • Bump license-maven-plugin from 2.1.0 to 2.2.0 by @dependabot in #229
  • Bump flyway-core from 9.19.4 to 9.20.0 by @dependabot in #227
  • Bump org.flywaydb:flyway-core from 9.20.0 to 9.21.0 by @dependabot in #233
  • Bump org.junit.jupiter:junit-jupiter from 5.9.3 to 5.10.0 by @dependabot in #231
  • Bump org.flywaydb:flyway-core from 9.21.0 to 9.21.1 by @dependabot in #234
  • Bump com.helger:ph-css from 7.0.0 to 7.0.1 by @dependabot in #235
  • Bump org.flywaydb:flyway-core from 9.21.1 to 9.22.0 by @dependabot in #241
  • Bump org.flywaydb:flyway-core from 9.22.0 to 9.22.1 by @dependabot in #244
  • Bump com.github.tomakehurst:wiremock-jre8 from 2.35.0 to 2.35.1 by @dependabot in #243
  • Bump org.apache.maven.plugins:maven-enforcer-plugin from 3.3.0 to 3.4.1 by @dependabot in #242
  • Bump apache.tika.version from 2.8.0 to 2.9.0 by @dependabot in #239
  • Bump org.apache.maven.plugins:maven-javadoc-plugin from 3.5.0 to 3.6.0 by @dependabot in #248
  • Bump org.codehaus.mojo:versions-maven-plugin from 2.16.0 to 2.16.1 by @dependabot in #249
  • Bump org.apache.httpcomponents.core5:httpcore5-h2 from 5.2.2 to 5.2.3 by @dependabot in #247
  • Bump org.apache.httpcomponents.core5:httpcore5 from 5.2.2 to 5.2.3 by @dependabot in #246
  • Bump commons-io:commons-io from 2.13.0 to 2.15.0 by @dependabot in #255
  • Bump com.fasterxml.jackson.core:jackson-core from 2.15.2 to 2.16.0 by @dependabot in #258
  • Bump org.flywaydb:flyway-core from 9.22.1 to 10.0.1 by @dependabot in #257
  • Bump de.thetaphi:forbiddenapis from 3.5.1 to 3.6 by @dependabot in #251
  • Bump apache.tika.version from 2.9.0 to 2.9.1 by @dependabot in #259
  • Bump org.codehaus.mojo:versions-maven-plugin from 2.16.1 to 2.16.2 by @dependabot in #260
  • Bump log4j.version from 2.20.0 to 2.22.0 by @dependabot in #261
  • Bump org.junit.jupiter:junit-jupiter from 5.10.0 to 5.10.2 by @dependabot in #263
  • Bump org.flywaydb:flyway-core from 10.0.1 to 10.8.1 by @dependabot in #268
  • Bump org.apache.httpcomponents.core5:httpcore5-h2 from 5.2.3 to 5.2.4 by @dependabot in #267
  • Bump com.fasterxml.jackson.core:jackson-core from 2.16.0 to 2.16.1 by @dependabot in #266
  • Bump org.assertj:assertj-core from 3.24.1 to 3.25.3 by @dependabot in #264
  • Bump log4j.version from 2.22.0 to 2.23.0 by @dependabot in #271
  • Bump org.postgresql:postgresql from 42.6.0 to 42.7.2 by @dependabot in #269
  • Bump org.apache.maven.plugins:maven-compiler-plugin from 3.11.0 to 3.12.1 by @dependabot in #270
  • Bump slf4j.version from 1.7.36 to 2.0.12 by @dependabot in #274
  • Bump commons-io:commons-io from 2.15.0 to 2.17.0 by @dependabot in #280
  • Bump org.apache.maven.plugins:maven-javadoc-plugin from 3.6.0 to 3.10.1 by @dependabot in #281
  • Bump de.thetaphi:forbiddenapis from 3.6 to 3.8 by @dependabot in #284
  • Bump com.fasterxml.jackson.core:jackson-core from 2.16.1 to 2.18.0 by @dependabot in #285
  • Bump org.hsqldb:hsqldb from 2.7.2 to 2.7.3 by @dependabot in #283
  • Bump org.apache.httpcomponents.core5:httpcore5 from 5.2.3 to 5.3 by @dependabot in #282
  • Bump org.apache.httpcomponents.core5:httpcore5-h2 from 5.2.4 to 5.3 by @dependabot in #286
  • Bump slf4j.version from 2.0.12 to 2.0.16 by @dependabot in #289
  • Bump org.codehaus.mojo:versions-maven-plugin from 2.16.2 to 2.17.1 by @dependabot in #288
  • Bump org.apache.maven.plugins:maven-surefire-plugin from 3.1.2 to 3.5.1 by @dependabot in #287
  • Bump org.apache.maven.plugins:maven-archetype-plugin from 3.2.1 to 3.3.0 by @dependabot in #290
  • Bump org.apache.maven.archetype:archetype-packaging from 3.2.1 to 3.3.0 by @dependabot in #292
  • Bump log4j.version from 2.23.0 to 2.24.1 by @dependabot in #291
  • Bump org.apache.httpcomponents.client5:httpclient5 from 5.2.1 to 5.4 by @dependabot in #293
  • Bump org.assertj:assertj-core from 3.25.3 to 3.26.3 by @dependabot in #294
  • Bump com.helger:ph-css from 7.0.1 to 7.0.3 by @dependabot in #295
  • Bump org.awaitility:awaitility from 4.2.0 to 4.2.2 by @dependabot in #297
  • Bump org.apache.maven.plugins:maven-source-plugin from 3.3.0 to 3.3.1 by @dependabot in #296
  • Bump org.postgresql:postgresql from 42.7.2 to 42.7.4 by @dependabot in #300
  • Bump org.apache.maven.plugins:maven-gpg-plugin from 3.1.0 to 3.2.7 by @dependabot in #298
  • Bump com.github.tomakehurst:wiremock-jre8 from 2.35.1 to 2.35.2 by @dependabot in #299
  • Bump org.apache.maven.plugins:maven-jar-plugin from 3.3.0 to 3.4.2 by @dependabot in #301
  • Bump org.flywaydb:flyway-core from 10.8.1 to 10.20.0 by @dependabot in #306
  • Bump org.codehaus.mojo:license-maven-plugin from 2.2.0 to 2.4.0 by @dependabot in #303
  • Bump com.github.crawler-commons:urlfrontier-API from 2.3.1 to 2.4 by @dependabot in #305
  • Bump com.mchange:c3p0 from 0.9.5.5 to 0.10.1 by @dependabot in #304
  • Bump org.junit.jupiter:junit-jupiter from 5.10.2 to 5.11.3 by @dependabot in #308
  • Bump org.jacoco:jacoco-maven-plugin from 0.8.10 to 0.8.12 by @dependabot in #310
  • Bump com.zaxxer:HikariCP from 5.0.1 to 6.0.0 by @dependabot in #309
  • Bump com.helger:ph-css from 7.0.1 to 7.0.3 by @dependabot in #307
  • Bump org.apache.maven.plugins:maven-compiler-plugin from 3.12.1 to 3.13.0 by @dependabot in #311
  • Bump org.apache.maven.plugins:maven-enforcer-plugin from 3.4.1 to 3.5.0 by @dependabot in #312
  • Release 5.1.0 by @rzo1 in #313

Full Changelog: v5.0.2...v5.1.0

v5.0.2

23 May 12:06

What's Changed

Full Changelog: v5.0.1...v5.0.2

v5.0.1

12 Dec 13:17

What's Changed

Full Changelog: v5.0.0...v5.0.1

v5.0.0

24 Aug 11:37

What's Changed

Full Changelog: v4.10.1...v5.0.0

v4.10.1

15 Aug 12:48

What's Changed

Full Changelog: v4.10.0...v4.10.1

v4.10.0

10 Jul 19:01

What's Changed

Full Changelog: v4.9.1...v4.10.0

v4.9.1

13 Jun 07:40

What's Changed

New Contributors

  • @brbog made their first contribution in #71

Full Changelog: v.4.9.0...v4.9.1

v4.9.0

04 May 09:51

What's Changed

Breaking Change

  • Removal of IOException in CrawlController.addSeed(String): #61
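
For callers, the practical effect is a compile-time one: Java rejects a catch clause for a checked exception that the try body can no longer throw, so an existing `catch (IOException e)` wrapped only around `addSeed(String)` has to be removed. The sketch below illustrates this with a hypothetical stand-in class (`StubController` and `seedAll` are not part of crawler4j); it only models the fact that `addSeed(String)` no longer declares `IOException`. Whether the real method declares other checked exceptions is not stated in these notes.

```java
import java.util.ArrayList;
import java.util.List;

class AddSeedMigration {

    /** Hypothetical stand-in for CrawlController; addSeed declares no IOException. */
    static class StubController {
        final List<String> seeds = new ArrayList<>();

        void addSeed(String pageUrl) {
            seeds.add(pageUrl);
        }
    }

    /** Adds every URL as a seed and returns the controller's seed list. */
    static List<String> seedAll(StubController controller, String... urls) {
        for (String url : urls) {
            // Before the change, callers wrapped this call in
            //   try { ... } catch (IOException e) { ... }
            // Keeping that catch around this call alone no longer compiles,
            // because the try body cannot throw IOException anymore.
            controller.addSeed(url);
        }
        return controller.seeds;
    }

    public static void main(String[] args) {
        List<String> seeds = seedAll(new StubController(),
                "https://example.com/", "https://example.org/");
        System.out.println(seeds);
    }
}
```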

Dependency

Full Changelog: v4.8.3...v.4.9.0

v4.8.3

05 Apr 16:34

What's Changed

Full Changelog: v4.8.2...v4.8.3

v4.8.2

28 Feb 08:25

What's Changed

  • Change of Default Behaviour: Politeness is now applied per host (rather than per request). To restore the "old" behaviour, you can pass the SimplePolitnessServer as a constructor parameter of PageFetcher.
  • Bump postgresql from 42.3.2 to 42.3.3 by @dependabot in #41
  • Bump flyway-core from 8.4.4 to 8.5.0 by @dependabot in #38
  • Bump spock-core from 2.0-groovy-3.0 to 2.1-groovy-3.0 by @dependabot in #39
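
To make the per-host idea concrete, here is a minimal, self-contained sketch of how a politeness delay can be tracked per host instead of globally. It is not crawler4j's implementation: the class `PerHostPoliteness`, its `reserve` method, and the 200 ms delay are hypothetical and chosen only for illustration. Each host gets its own "next allowed" timestamp, so requests to different hosts do not wait on each other.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class PerHostPoliteness {

    private final long delayMillis;
    // Earliest time (ms) at which the next request to each host may start.
    private final Map<String, Long> nextAllowed = new HashMap<>();

    PerHostPoliteness(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    /**
     * Books the next fetch slot for the URL's host and returns how many
     * milliseconds the caller should wait, given the current time.
     */
    synchronized long reserve(String url, long nowMillis) {
        String host = URI.create(url).getHost();
        long earliest = nextAllowed.getOrDefault(host, nowMillis);
        long start = Math.max(earliest, nowMillis);
        nextAllowed.put(host, start + delayMillis);
        return start - nowMillis;
    }

    public static void main(String[] args) {
        PerHostPoliteness politeness = new PerHostPoliteness(200);
        System.out.println(politeness.reserve("https://a.example/1", 0));  // prints 0
        System.out.println(politeness.reserve("https://b.example/1", 0));  // prints 0
        System.out.println(politeness.reserve("https://a.example/2", 50)); // prints 150
    }
}
```

In this sketch the second request to `a.example` is delayed, while the request to `b.example` proceeds immediately; a per-request scheme would instead delay every fetch regardless of its host.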

Full Changelog: v.4.8.1...v4.8.2