content API endpoint #3048

Closed
vladak opened this issue Feb 20, 2020 · 7 comments

@vladak
Member

vladak commented Feb 20, 2020

Is your feature request related to a problem? Please describe.
#3042 listed a few new API endpoints that would be useful to have. This issue tracks the addition of the /content endpoint for getting the content of a file.

Describe the solution you'd like
Return a JSON array of objects, each containing:

  • line number
  • line content (verbatim, i.e. no xref)

The endpoint should support a parameter to specify the revision of the file. It may support paging.
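
A minimal sketch of the response shape described above, written in Python for brevity (OpenGrok itself is Java). The field names `number` and `line` and the `start`/`count` paging parameters are assumptions made for illustration; the issue does not specify them.

```python
import json


def lines_as_objects(text, start=1, count=None):
    """Return [{"number": n, "line": s}, ...] for the requested page.

    Field names and the start/count paging parameters are hypothetical.
    """
    # splitlines() drops the line terminators, so each entry holds the
    # verbatim line text. Note that it also splits on a few less common
    # line boundaries (e.g. form feed), not only on "\n" and "\r\n".
    numbered = [
        {"number": n, "line": s}
        for n, s in enumerate(text.splitlines(), start=1)
    ]
    first = max(start, 1) - 1                      # 1-based line number to 0-based index
    last = None if count is None else first + count
    return numbered[first:last]


sample = "int main() {\n    return 0;\n}\n"
# Whole file as a JSON array of objects.
print(json.dumps(lines_as_objects(sample), indent=2))
# One "page": a single line starting at line 2.
print(json.dumps(lines_as_objects(sample, start=2, count=1)))
```

Line numbers stay absolute when paging, so a client can stitch pages together without tracking offsets.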

@vladak
Member Author

vladak commented Feb 20, 2020

The above assumes text files that contain newlines. Perhaps the controller could verify the type of the file (AnalyzerGuru#getAnalyzerFor(file, path) comes to mind: check whether the returned object is an instance of TextAnalyzer) and return an empty object for binary files. Those should be obtainable via the /download link (not an API endpoint).
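
The guard described above can be sketched roughly as follows. This is a Python stand-in that uses a crude byte-level heuristic (no NUL bytes, valid UTF-8 in a leading sample); it is not what AnalyzerGuru does, and the function names and the sample size are hypothetical.

```python
import codecs


def looks_like_text(data, sample_size=8192):
    """Heuristic: the leading sample has no NUL byte and is valid UTF-8."""
    sample = data[:sample_size]
    if b"\x00" in sample:
        return False
    # An incremental decoder with final=False tolerates a multi-byte
    # character that was cut off at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def content_or_empty(data):
    """Return the list of lines for text input, or an empty object for binary input."""
    if not looks_like_text(data):
        return {}  # binary files would be fetched via the /download link instead
    # errors="replace" keeps this from raising if bytes past the sample are invalid.
    return data.decode("utf-8", errors="replace").splitlines()


print(content_or_empty(b"first line\nsecond line\n"))
print(content_or_empty(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
```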

@vladak
Member Author

vladak commented Feb 21, 2020

After implementing the above (without pagination), I have to say it is not clear to me how beneficial it is to have this endpoint. Anyone (authorized, if an authorization config is in place) can download the content of any file, split it into lines and assign them line numbers. It sort of makes sense when combined with text file detection (as suggested in the comment above) and maybe if pagination (for long files) is supported. @jbaek7023 ?

I was also thinking about exposing AnalyzerGuru#getGenre() to output the genre for a given file (it needs to be used with an InputStream; otherwise it merely matches the path against known prefixes/suffixes). There could be both /file/content and /file/genre endpoints.

@jbaek7023

jbaek7023 commented Feb 21, 2020

I'm trying to implement a front end that talks to the OpenGrok API (inspired by SourceGraph). Annotation and history can be shown by toggling "Annotate" or clicking the "CL" button.
But as you can see, I still need to get the content (each line and its content, plus the file structure).

[Screenshot: Screen Shot 2020-02-21 at 1 46 26 PM]

I think your implementation will be super useful in the future for anyone who wants to use the OpenGrok indexing engine. Also, I don't think we'll need pagination here.

Thanks again for the quick turnaround...!! Your impact is big.

@idodeclare
Contributor

> The above assumes text files that contain newlines. Perhaps the controller could verify the type of the file (AnalyzerGuru#getAnalyzerFor(file, path) comes to mind and check if returned object is instanceOf TextAnalyzer) and return empty object in case of binary files - these should be possible to get via the /download link (not an API endpoint).

A T field holding the Genre is defined in the Lucene document for non-binary input. Only Plain would be applicable to this issue, I think. No re-analysis of the source text would be needed or desired.

@jbaek7023

jbaek7023 commented Feb 22, 2020

@idodeclare No, the DOM tags are all valuable. I just want to override the style.css files, which means analysis from the source text will be necessary. I still want to keep the DOM tags () in the JSON data.

(Apologies in advance if this is a dumb thought)

@idodeclare
Contributor

@jbaek7023 no, I was referring to @vladak's mention of calling AnalyzerGuru#getAnalyzerFor(file, path) to determine which files are plain text. There is no need for that, since OpenGrok will have saved a T field of PLAIN into Lucene when applicable.

vladak pushed a commit to vladak/OpenGrok that referenced this issue Feb 24, 2020
- no line number support

fixes oracle#3048
@vladak
Member Author

vladak commented Feb 24, 2020

Thanks @idodeclare, that's what I needed.

@vladak vladak self-assigned this Feb 24, 2020
@vladak vladak closed this as completed in d7648fc Feb 26, 2020