Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SystemVerilog: user defined type support #2653

Merged
merged 22 commits into from
Oct 5, 2020

Conversation

hirooih
Copy link
Contributor

@hirooih hirooih commented Sep 27, 2020

@masatake san,
I've working on #1488 and #2413. It is not completed yet, but let me share the current status with you.

First I checked uncovered lines in verilog.c and add some test cases.
After that I made several refactoring changes.
Until now I made changes which did not affect on the output. Next I will work on variable/net type checking routines.

In addition to the Unit Tests I've been using source code of UVM 1.2 and an opensource RISC CPU to ensure output was not changed.

This is a draft pull request. Your feedback and advice are welcome.
Thank you.

…sk-function.d

These covers some of uncovered lines.
Incorrect in extected.tags due to known issues are marked as FIXME.
processTypedef: remove skipping {...} obsoleted by processStruct (cf. universal-ctags#2643)
createTag(): divide covered/uncovered condition.
findVerilogTags(): make more simple and robust
mark uncovered lines as "FIXME: uncoverd"
@coveralls
Copy link

coveralls commented Sep 27, 2020

Coverage Status

Coverage increased (+0.03%) to 86.997% when pulling 924ef4e on hirooih:sv-type-refactoring into 86bdb84 on universal-ctags:master.

@masatake masatake self-requested a review September 27, 2020 22:53
K_CONSTANT,
/* parser private items */
K_IGNORE = -16,
K_DEFINE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s it the best way to tag "foo" of "`define foo ..." with constant kind?
Introducing "d/definition" kind and tagging the "foo" with the definition kind is an alternative way.

Using the unified kind "constant" in many areas reduces the information that client tools get from a tags file.
In my experience, it is better to tag different things with different kinds. If ctags hides the differences, and just reports various language objects as "constant," in some cases, a client tool must parse the raw .sv source file for distinguishing whether a given tag is "`define"'ed or not.

IMHO, ctags should not be too smart. Instead, ctags should be stupid; it should report what it sees in source input files as-is to client tools that will do something smart.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what I wrote here is pointless; introducing the d/definition kind breaks compatibility.
So introducing the kind is not so easy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.
And I also agree with you that `define should have its own kind if we are allowed some incompatibility.

@masatake
Copy link
Member

I reviewed up to 69e311f.

@@ -221,6 +225,8 @@ static tokenInfo *tagContents = NULL;
* FUNCTION DEFINITIONS
*/

static void updateKind (tokenInfo *const token);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor issue.
updateKind here is not a function definition. So I would like you to introduce FUNCTION PROTOTYPES section.
See the following example taken from c.c:

/*
*   FUNCTION PROTOTYPES
*/
static void createTags (const unsigned int nestLevel, statementInfo *const parent);

/*
*   FUNCTION DEFINITIONS
*/

/*
*   Token management
*/

static void initToken (tokenInfo* const token)
{
	token->type			= TOKEN_NONE;

{
/* Drop context, but only if an end token is found */
dropEndContext (token);
}
if (token->kind == K_END_DE)
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this line is called only for broken (invalid syntax) input.
In that case, putting verbose() here may help us in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is called for correct input, endfunction, endmodule, endtask, ..., etc.


Your comment reminds me one more question.
I think we should call notice() instead of verbose() for broken input.
And every parser should use same format something like this this.

How do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think notice() is too noisy. Furthermore, verbose is also noise for the purpose. Remember ctags will deal with large codebases.

Printing such information has two purposes.

  • reducing pointless bug reports
    We know the motto, "if the input is broken, ctags may emit garbages". However, users may not know.
    As the result, we have chances to receive bug reports of ctags to that broken input is given.
    The notice can notify that broken input is given to ctags.

  • debugging
    This is for us, not for users. For this purpose, verbose() and notice() are not suitable.
    We can use macros in trace.h. See parsers/fypp.c.
    The macros can be used only with ctags built with ./configure --enable-debugging.
    --_trace= must be passed to ctags at runtime.

@masatake
Copy link
Member

masatake commented Oct 1, 2020

I reviewed up to "SystemVerilog: mark fork as K_BEGIN and join* as K_END to create a co…" (bc88a0c).

@@ -582,6 +582,7 @@ static void skipToSemiColon (void)
} while (c != EOF && c != ';');
}

/* read an identifier, keyword, number, compiler directive, or macro identifier */
static bool readIdentifier (tokenInfo *const token, int c)
Copy link
Member

@masatake masatake Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think about renaming this function to "readToken"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been thinking exactly same thing.
I will rename them after this issue is complete.

{
if (token->kind == K_UNDEFINED)
{
for (int i = 0; i < token->name->length; i++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use vStringLength (token->name) instead of token->name->length..

{
for (int i = 0; i < token->name->length; i++)
{
int c = token->name->buffer[i];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vStringChar (token->name, i) can be used.

@masatake
Copy link
Member

masatake commented Oct 1, 2020

Mostly it looks good to me.
I pointed out some minor issues as a result of the review.

@masatake
Copy link
Member

masatake commented Oct 1, 2020

BTW, you can make an issue for tracking SystemVerilog issued as I did for Ruby in #2480.

@hirooih
Copy link
Contributor Author

hirooih commented Oct 1, 2020

@masatake san,

Thank you for your review.
I'll fix the issues you pointed.

Which way do you prefer?

  1. make one commit for these issue. You can see the difference from code you already reviewed.
  2. squash each fix to the original commit. The change may not be clear for you.
  3. make each commit. Squash them after you reviewed.

I prefer 1 or 2.
1 is easiest. 2 and 3 are almost same for me. This time changes are so trivial that you don't have to review seriously. 2 may be enough.
Anyway I will follow your way.


Let me ask you some questions.

Q1. "//" type comment vs "/* */" type comment
Many other parsers are using "//" style comment, but I found verilog.c has been using only "/* */" style..
I already use it in my local commits. Should I change them to "/* */" style comment?

Q2. nested if-else vs switch case in findTag()
As you see, the nested if-else in findTag() can be replaced with a switch case. I prefer switch case because it gives readers important information that all condition depend on one variable (token->kind). How do you think?


Supporting user-defined type is almost working well on my working directory. I am adding some more tests and make some refactoring. I hope I can commit it this weekend.

@masatake
Copy link
Member

masatake commented Oct 1, 2020

  1. squash each fix to the original commit. The change may not be clear for you.

The original commits are readable. They are short enough so I can understand. Please, keep the way.
However, if you can reflect my review comment to the commit that I mentioned to easily, update the commit.
As I wrote before, git rebase -i may help you.

The original commits convince me you are more than a contributor.
About coding style, do as you want.

About indent-style, see .editorconfig. About C syntax, in most of the areas, we choose stdc99.
Many different people have joined the development of ctags (and left), so the style of coding is not unified.
Ideally, they should be unified. However, I have kept it as is. If a parser has no active maintainer, I adjust the
coding style of the parser as I like. If a parser has an active maintainer, generally, I say nothing; the maintainer
should do as one wants. And, finally, the SystemVerilog parser gets a new maintainer.

If you get a question about coding, asking me is one of the ways. Inspecting other parsers is an alternative way.

@codecov
Copy link

codecov bot commented Oct 2, 2020

Codecov Report

Merging #2653 into master will increase coverage by 0.03%.
The diff coverage is 96.04%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2653      +/-   ##
==========================================
+ Coverage   86.86%   86.90%   +0.03%     
==========================================
  Files         183      183              
  Lines       38973    39016      +43     
==========================================
+ Hits        33855    33906      +51     
+ Misses       5118     5110       -8     
Impacted Files Coverage Δ
parsers/verilog.c 98.79% <96.04%> (+1.58%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86bdb84...924ef4e. Read the comment docs.

@hirooih
Copy link
Contributor Author

hirooih commented Oct 2, 2020

  1. squash each fix to the original commit. The change may not be clear for you.

I've done this.
The moving the prototype definition is squashed into 7bbbffb.
vString*() fixes were done by 'git --amend'.

I will do other additional refactoring after fixes which I am working will be done.

About coding style, do as you want.

OK. I'll do my best.

@masatake
Copy link
Member

masatake commented Oct 2, 2020

I sent you an invitation to the u-ctags org. Please, merge your changes for yourself after accepting it.
There are some methods for merging in GitHub web UI. If there is no special requirement, I choose the "Merge pull request" method.

@masatake masatake marked this pull request as ready for review October 2, 2020 22:02
@hirooih
Copy link
Contributor Author

hirooih commented Oct 4, 2020

Finally I've committed the fix for user defined type support.
a54e815 is the main fix.

By supporting user defined type correctly some expected.tags must be updated.
I also add some test cases.

@masatake
Copy link
Member

masatake commented Oct 4, 2020

I found a very minor issue.
About a54e815,
' in the second line of the commit log is not closed:

is followed by ',', ':', or ').
-----------------------------^

}
}
vUngetc (c);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if c is not '#', the memory object associated with token leaks.

  1. install valgrind
  2. git checkout a54e815
  3. make
  4. make units LANGUAGES=SystemVerilog VG=1
  5. cat Units/parser-verilog.r/systemverilog-class.d/VALGRIND.tmp
==20931== Memcheck, a memory error detector
==20931== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==20931== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==20931== Command: ./ctags --verbose --options=NONE --optlib-dir=+./Units/parser-verilog.r/systemverilog-class.d/optlib -o - --options=./Units/parser-verilog.r/systemverilog-class.d/args.ctags ./Units/parser-verilog.r/systemverilog-class.d/input.sv
==20931== Parent PID: 19254
==20931== 
==20931== 
==20931== HEAP SUMMARY:
==20931==     in use at exit: 768 bytes in 21 blocks
==20931==   total heap usage: 6,777 allocs, 6,756 frees, 360,392 bytes allocated
==20931== 
==20931== 768 (264 direct, 504 indirect) bytes in 3 blocks are definitely lost in loss record 7 of 7
==20931==    at 0x4C2EE3B: malloc (vg_replace_malloc.c:309)
==20931==    by 0x4149BD: eMalloc (routines.c:220)
==20931==    by 0x471C8E: newToken (verilog.c:380)
==20931==    by 0x472147: processParameterList (verilog.c:1235)
==20931==    by 0x4736A8: processClass (verilog.c:1304)
==20931==    by 0x4736A8: findTag (verilog.c:1577)
==20931==    by 0x4736A8: findVerilogTags (verilog.c:1648)
==20931==    by 0x411292: createTagsForFile (parse.c:3722)
==20931==    by 0x411292: createTagsWithFallback1 (parse.c:3844)
==20931==    by 0x4115E3: createTagsWithFallback (parse.c:3934)
==20931==    by 0x4115E3: parseMio (parse.c:4096)
==20931==    by 0x411746: parseFileWithMio (parse.c:4142)
==20931==    by 0x411873: parseFile (parse.c:4079)
==20931==    by 0x403307: createTagsForEntry (main.c:221)
==20931==    by 0x40364F: createTagsForArgs (main.c:266)
==20931==    by 0x40364F: batchMakeTags (main.c:360)
==20931==    by 0x403AD5: runMainLoop (main.c:332)
==20931==    by 0x403AD5: ctags_cli_main (main.c:585)
==20931== 
==20931== LEAK SUMMARY:
==20931==    definitely lost: 264 bytes in 3 blocks
==20931==    indirectly lost: 504 bytes in 18 blocks
==20931==      possibly lost: 0 bytes in 0 blocks
==20931==    still reachable: 0 bytes in 0 blocks
==20931==         suppressed: 0 bytes in 0 blocks
==20931== 
==20931== For counts of detected and suppressed errors, rerun with: -v
==20931== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GitHub action "run units target under VALGRIND" looks broken. It should report this memory leak. I must fix it.

tagNameList: an identifier is classified as net or variable name when it
is followed by ',', ':', or ')'.

define processParameterList() for both module and class parameter list.

It is based on the code which was included in processClass().
tags are output in the order defined.  (class parameter was in reversed order.)
parens in expression was handled by skipPastMatch().
type with dimenssion is supported
better support of class inheritance

only identifiers followed by ',', ':', or ')' are output as nets or variables

labels are not passed to tagNameLlist() anymore.
inheritance is handled correctly
parameters are output in order defined.

add systemverilog-parameter.d/
add more tests: systemverilog-net-var.d
update expected.tags for user defined type
@hirooih
Copy link
Contributor Author

hirooih commented Oct 4, 2020

Thank you for your prompt reply.

make units LANGUAGES=SystemVerilog VG=1

This is great!

I've fix it and confirm VG=1 test passes.

@masatake
Copy link
Member

masatake commented Oct 5, 2020

BTW, I found an interesting document: https://docs.ctags.io/en/latest/contributions.html#
I cannot remember when I wrote this.

@hirooih hirooih merged commit 9ce3543 into universal-ctags:master Oct 5, 2020
@hirooih
Copy link
Contributor Author

hirooih commented Oct 5, 2020

Please, merge your changes for yourself after accepting it.

I missed this sentence. Thank you for your review and approval.

@masatake
Copy link
Member

masatake commented Oct 5, 2020

Thank you and welcome to the team.

@masatake
Copy link
Member

masatake commented Oct 5, 2020

Please close #1488 and #2413 if the issues are solved with the commits.

@hirooih
Copy link
Contributor Author

hirooih commented Oct 5, 2020

The fixes in this pull request include the following user visible changes;

  • a fork-join creates a context as begin-end does
  • delays like #foo, #'foo, Added php patch #10, etc. are handled correctly
  • multi-dimensional array (ex. [1:0] [7:0])is supported
  • parameter assignment on function definition ignored properly
  • parameter list also support user defined types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants