-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating a parser to transform the Matlab documentation into a structure #20
base: master
Are you sure you want to change the base?
Changes from 5 commits
fc3d0a9
8ff1554
4687abb
9d2cc67
bc33759
adf9958
333a304
e6e3348
005394e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
function docStruct = parse_doc(functionPath) | ||
%PARSE_DOC Generates a structure from the input function's documentation. | ||
% | ||
% SYNTAX | ||
% | ||
% docStruct = parse_doc(functionPath) | ||
% | ||
% DESCRIPTION | ||
% | ||
% Parses the function corresponding to the input path and fetches the | ||
% information corresponding to the function's documentation before | ||
% reorganising it as a structure | ||
% | ||
% INPUTS | ||
% | ||
% functionPath | ||
% Character array corresponding to the path of the function. | ||
% | ||
% OUTPUTS | ||
% | ||
% docStruct | ||
% Structure containing the different parts of the function's | ||
% documentation as its fields (summary, description, inputs, outputs, | ||
% and notes). | ||
% | ||
% NOTES | ||
% | ||
% It requires a total respect of the template (e.g no "forgotten" spaces). | ||
% All the fields (SYNTAX, DESCRIPTION, etc...) must be provided in the | ||
% parsed function. | ||
|
||
%% Read the function and keep only the description section | ||
functionTxt = fopen(functionPath); % Open the function file | ||
gaspardcereza marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
fgetl(functionTxt); % Skip the first line (function...) | ||
% Initialize a cell that will receive the lines in the description | ||
functionDoc = []; | ||
gaspardcereza marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
textLine = fgetl(functionTxt); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line is duplicated with the one at line 55. could it be possible to fgetl at the begining of the while loop so you have it only once. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The condition for my while loop is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see. I just really don't like duplicated code. You could also test in an if statement with a break command: |
||
nLine = 1; | ||
headersPos = []; % Store the positions of the headers | ||
while length(textLine) >= 1 % Parsing will stop at first empty line | ||
if erase(textLine, '%') ~= 0 % Delete the empty commented lines | ||
|
||
functionDoc = [functionDoc string(erase(textLine, '%'))]; | ||
|
||
if sum(isstrprop(strip(erase(textLine, '%')), 'upper')) ==... | ||
length(strip(erase(textLine, '%'))) % Check uppercase | ||
|
||
headersPos = [headersPos nLine]; % Store the header line | ||
|
||
end | ||
nLine = nLine+1; | ||
end | ||
textLine = fgetl(functionTxt); % Reads the next line | ||
end | ||
|
||
fclose(functionTxt); % Close the function file | ||
headersPos = [headersPos length(functionDoc)+1]; % Position of the last line | ||
%% Fetch the function's summary | ||
docStruct.summary = functionDoc(1); % Store the summary of the function | ||
|
||
%% Fetch the other sections | ||
|
||
for nSection = 1:length(headersPos)-1 % Number of sections | ||
header = strip(functionDoc(headersPos(nSection))); % Section name | ||
sectionStart = headersPos(nSection)+1; % First line after header | ||
sectionEnd = headersPos(nSection+1)-1; % Last line before next header | ||
|
||
switch header | ||
case 'SYNTAX' | ||
docStruct.syntax = erase(functionDoc(sectionStart:sectionEnd),' '); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The code is really whitespace dependent and the problem is that there is a lot of characters that are refered as whitespaces (tab, space, null, ...). When someone will use a different number of spaces, everything will break. A good way of reducing this problem and making the documentation more flexible is adding a symbol in front of specific information ('->' in front of inputs, '<- in front of outputs', '@' in front of notes, '_' in front of types). This way you know that everything that is after a certain symbol is a certian type until you reach the next symbol. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes you're right but is it really easier to make sure everyone adds the right symbol in the right place or that they just respect the number of white spaces ? That's something I've been asking myself when I was coding but I don't have a clear preference for any of these options so I'm open to discussion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem with the whitespaces is actually the implementation (tabs vs spaces vs null characters, etc.). There's a lot of ways to add space in a text editor. However with a symbol it's easy to know which one. |
||
case 'DESCRIPTION' | ||
docStruct.description = strjoin(strip(functionDoc(sectionStart:sectionEnd)),' '); | ||
case 'INPUTS' | ||
section = functionDoc(sectionStart:sectionEnd); | ||
docStruct.inputs.names = strip(section(cellfun('isempty', strfind(section,' ')))); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Try to be carreful with embeding functions in other functions. A lot of parenthesis can make the code less readable. sometimes, it is better to separate in multiple lines. |
||
docStruct.inputs.description = split(strjoin(replace(strip(section(2:end)),docStruct.inputs.names(:),'|||')),'|||')'; | ||
case 'OUTPUTS' | ||
section = functionDoc(sectionStart:sectionEnd); | ||
gaspardcereza marked this conversation as resolved.
Show resolved
Hide resolved
|
||
docStruct.outputs.names = strip(section(cellfun('isempty', strfind(section,' ')))); | ||
docStruct.outputs.description = split(strjoin(replace(strip(section(2:end)),docStruct.outputs.names(:),'|||')),'|||')'; | ||
case 'NOTES' | ||
docStruct.notes = strjoin(strip(functionDoc(sectionStart:end)),''); | ||
otherwise | ||
error('Unknown section name in the function documentation') | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
function [output1, output2] = test(arg1, arg2) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. isn't the convention to call test functions with suffix as the function to test, i.e.: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. didn't know that. I'll change it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i don't know either-- i'm basing it just from python's pytest-- obviously things might be different in matlab-- please inform yourself how those tests are run-- @po09i @gab-berest @rtopfer might know There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd say that test names (as for unit test if I understand correctly) should have the name of the tested function AND the result. Maybe with more complexe function you'll need to test if the function does the correct thing, but also exceptions. For example: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It makes big function names, but when testing it will be clearer when something fails. |
||
%TEST Computes output1 and output 2 from arg1 and arg2. | ||
% | ||
% SYNTAX | ||
% | ||
% output1 = test(arg1, arg2) | ||
% [output1, output2] = test(arg1, arg2) | ||
% | ||
% DESCRIPTION | ||
% | ||
% Computes output1 as the sum of arg1 and arg2 and output2 as the | ||
% difference between arg1 and arg2. | ||
% | ||
% INPUTS | ||
% | ||
% arg1 | ||
% Scalar. This line is very long because i wanted to test if the input | ||
% description would be correctly parsed if it is longer than one line. | ||
% | ||
% arg2 | ||
% Scalar | ||
% | ||
% OUTPUTS | ||
% | ||
% output1 | ||
% Sum of arg1 and arg2 | ||
% | ||
% output2 | ||
% Difference between arg1 and arg2 | ||
% | ||
% NOTES | ||
% | ||
% That function is destined to test the parsing of the function | ||
% documentation (done by parse_doc.m). | ||
gaspardcereza marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
output1 = arg1+arg2; | ||
% We don't want that kind of comments to appear in the parsed documentation. | ||
gaspardcereza marked this conversation as resolved.
Show resolved
Hide resolved
|
||
output2 = arg1-arg2; % Neither this kind of comments | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what are the assertions for this test? how can it fail? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That test function is just some dummy function that will be parsed (it could even be simply deleted and we could test the parsing directly on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. bits-by-bits-- good practice is: for every code that is written, a test should go along it. never "postpone" writing tests things that we postpone might take a while to be implemented and then we forget about it so the philosophy is: do it now, not tomorrow also: a test that never fails is not a test There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok I'll create a unit test right now then. Is it ok if I create a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @po09i is currently looking into this and can let us know what to do |
||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's something that was agreed previously, but is it necessary to have a struct? Isn't it a better idea to have a dictionary (map) so you can input any information needed in the header like examples, etc. and make a test at the end of the parse to check that minimal sections are present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to go with a struct mainly because it was something I knew how to use. 😉 I'm not familiar with dictionaries but if you think that might be a better solution I'd be glad to discuss about it ! Is this what you are referring to ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's exactly what I'm thinking about. I think maps or set depending on what we want could be more versatile.