-
Notifications
You must be signed in to change notification settings - Fork 33
WIP SRUopener #682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
WIP SRUopener #682
Conversation
|
Nice seems to work. +1 |
|
@dr0i is still in review? |
|
As we found out in #510 this PR needs a complete redesign. |
|
@TobiasNx can you do functional tests before I go on here? Have a look at the |
|
@dr0i I tried to install the dist: https://metafacture.github.io/metafacture-documentation/docs/flux/Flux-User-Guide.html#build-from-local-distribution to try the runner for functional testing but it runs into errors: When I test the flux.sh then it outputs the following: Can you help? (I tested the current master to compare, there |
|
Ah, I accidently removed the TarReader. |
|
Current version stucks in an endless SRU request loop starting by 1 again after finishing all request does not matter if a total number of records is given or not: e.g. Both result in, see that recordPosition 1 is turning up again after the expected last recordPosition 8: <?xml version="1.0" encoding="UTF-8"?><searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"><version>1.1</version><numberOfRecords>8</numberOfRecords><records><record><recordSchema>MARC21plus-xml</recordSchema><recordPacking>xml</recordPacking><recordData><collection xmlns="http://www.loc.gov/MARC21/slim">
<record type="Authority">
<leader>00000nz a2200000nc 4500</leader>
<controlfield tag="001">042278333</controlfield>
<controlfield tag="003">DE-101</controlfield>
<controlfield tag="005">20110429135047.0</controlfield>
<controlfield tag="008">900305n||azznnaabn | ana |c</controlfield>
<datafield ind1="7" ind2=" " tag="024">
<subfield code="a">4227833-8</subfield>
<subfield code="0">http://d-nb.info/gnd/4227833-8</subfield>
<subfield code="2">gnd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="035">
<subfield code="a">(DE-101)042278333</subfield>
...
<datafield ind1=" " ind2=" " tag="913">
<subfield code="S">swd</subfield>
<subfield code="i">k</subfield>
<subfield code="a">Internationaler Sozialistenkongress</subfield>
<subfield code="0">(DE-588c)4021089-3</subfield>
</datafield>
</record>
</collection></recordData><recordPosition>8</recordPosition></record></records><echoedSearchRetrieveRequest><version>1.1</version><query>WOE=sozialistenkongress and COD=s</query><xQuery xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/><startRecord>6</startRecord><maximumRecords>5</maximumRecords><recordSchema>MARC21plus-xml</recordSchema></echoedSearchRetrieveRequest></searchRetrieveResponse>
<?xml version="1.0" encoding="UTF-8"?><searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"><version>1.1</version><numberOfRecords>8</numberOfRecords><records><record><recordSchema>MARC21plus-xml</recordSchema><recordPacking>xml</recordPacking><recordData><collection xmlns="http://www.loc.gov/MARC21/slim">
<record type="Authority">
<leader>00000nz a2200000nc 4500</leader>
<controlfield tag="001">042278333</controlfield>
<controlfield tag="003">DE-101</controlfield>
<controlfield tag="005">20110429135047.0</controlfield>
<controlfield tag="008">900305n||azznnaabn | ana |c</controlfield>
<datafield ind1="7" ind2=" " tag="024">
<subfield code="a">4227833-8</subfield>
<subfield code="0">http://d-nb.info/gnd/4227833-8</subfield>
<subfield code="2">gnd</subfield>
...
</datafield>
<datafield ind1=" " ind2=" " tag="913">
<subfield code="S">swd</subfield>
<subfield code="i">c</subfield>
<subfield code="a">Bern / Internationaler Sozialistenkongress <1919></subfield>
<subfield code="0">(DE-588c)4227833-8</subfield>
</datafield>
</record>
</collection></recordData><recordPosition>1</recordPosition></record><record><recordSchema>MARC21plus-xml</recordSchema><recordPacking>xml</recordPacking><recordData><collection xmlns="http://www.loc.gov/MARC21/slim">
<record type="Authority">
<leader>00000nz a2200000nc 4500</leader>
<controlfield tag="001">1267605979</controlfield>
<controlfield tag="003">DE-101</controlfield>
<controlfield tag="005">20230329111229.0</controlfield>
<controlfield tag="008">220908n||azznnaabn | ana |c</controlfield>
<datafield ind1="7" ind2=" " tag="024">
<subfield code="a">1267605979</subfield>
<subfield code="0">http://d-nb.info/gnd/1267605979</subfield>
<subfield code="2">gnd</subfield>
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that SRU opener stucks in infinite loop. See: #682 (comment)
|
@TobiasNx Can you update here that this is no bug but caused by marcxmlplus (or so) ? |
|
@dr0i The behaviour I reported was not the problem from the workshop. But the identation behaviour i reported is related to |
Every single output is a valid XML by itself.
|
Hi @blackwinter if you have some time: can you implement tests here? Functional-wise is the modul ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a first pass and left some comments. We should discuss tests after the open questions are resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fails Checkstyle check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved in 415dece besides the ClassFanOutComplexity complain. What to do about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checkstyle-disable-line it for now.
| DocumentBuilder docBuilder = factory.newDocumentBuilder(); | ||
| Document xmldoc = docBuilder.parse(inputStreamOfURl); | ||
|
|
||
| Transformer t = TransformerFactory.newInstance().newTransformer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto (made them final)
metafacture-io/src/test/java/org/metafacture/io/SruOpenerTest.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Jens Wille <[email protected]>
…-core into 510-addSruOpener
- adjust test query to work with URLencoding Complements 2fd8da6. i#Bitte geben Sie eine Commit-Beschreibung für Ihre Änderungen ein. Zeilen,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the tests, should be mostly okay. Left some more comments regarding small optimizations and naming improvements.
P.S.: Please let the original commenter resolve any open conversations. They should have the opportunity to decide whether they're satisfied with your proposed solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checkstyle-disable-line it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fails Checkstyle check.
| @Override | ||
| public void process(final String baseUrl) { | ||
|
|
||
| final StringBuilder srUrl = new StringBuilder(baseUrl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a String to begin with (no need to build the string for every iteration).
Throw the exception first, then either build the string once or just concatenate all its elements.
| try { | ||
| final InputStream inputStreamOfURl = retrieveUrl(srUrl); | ||
| final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); | ||
| final DocumentBuilder docBuilder = factory.newDocumentBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the DocumentBuilder be reused? Then move it outside the loop or make it static.
| final DocumentBuilder docBuilder = factory.newDocumentBuilder(); | ||
| final Document xmldoc = docBuilder.parse(inputStreamOfURl); | ||
|
|
||
| final Transformer t = TransformerFactory.newInstance().newTransformer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give t a more descriptive name.
Co-authored-by: Jens Wille <[email protected]>
This is a draft and WIP.
@TobiasNx you can use it for functional testing.
Resolves #510.