Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid the comma operator #1889

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Conversation

dscho
Copy link
Member

@dscho dscho commented Mar 24, 2025

The comma operator is rarely used in C anymore, and typically indicates a typo. Just like in these instances, where a semicolon was meant to be used, as there is no need to discard the first statement's result here.

Changes since v2:

  • Made the sed construct in detect-compiler portable (thanks, Eric Sunshine!)
  • The majority of the feedback disagreed with the more compact format in diff-delta.c, so I changed it to the long format (thanks, Phillip Wood!)
  • The more succinct and safer, but less readable, cast in the loop condition of the dowild() function was replaced with the goto-based alternative I had mentioned as a possibility in the commit message (thanks, Phillip Wood!)
  • I adjusted the style of my compat/regex/ patch to the surrounding code's.
  • The -Wcomma option is now used in Meson-based clang builds, too (thanks, Patrick Steinhardt!)

Changes since v1:

  • Use -Wcomma when compiling with clang and with DEVELOPER=1.
  • Address the remaining instances pointed out by clang (and by Phillip).

cc: Philip Oakley [email protected]
cc: Patrick Steinhardt [email protected]
cc: Phillip Wood [email protected]
cc: Karthik Nayak [email protected]
cc: Jeff King [email protected]
cc: Taylor Blau [email protected]
cc: Eric Sunshine [email protected]
cc: Chris Torek [email protected]

@dscho dscho self-assigned this Mar 24, 2025
@dscho
Copy link
Member Author

dscho commented Mar 25, 2025

/submit

Copy link

gitgitgadget bot commented Mar 25, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1889/dscho/comma-operator-v1

To fetch this version to local tag pr-1889/dscho/comma-operator-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1889/dscho/comma-operator-v1

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Philip Oakley wrote (reply to this):

On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> The comma operator
> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> rarely used in C anymore, and typically indicates a typo. Just like in these
> instances, where a semicolon was meant to be used, as there is no need to
> discard the first statement's result here.

Minor aside: How were these 'discovered'?

> 
> Johannes Schindelin (2):
>   remote-curl: avoid using the comma operator unnecessarily
>   rebase: avoid using the comma operator unnecessarily
> 
>  builtin/rebase.c | 2 +-
>  remote-curl.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> 
> base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1889%2Fdscho%2Fcomma-operator-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1889/dscho/comma-operator-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1889

Philip

Copy link

gitgitgadget bot commented Mar 25, 2025

User Philip Oakley <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 08:01:48AM +0000, Johannes Schindelin via GitGitGadget wrote:
> The comma operator
> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> rarely used in C anymore, and typically indicates a typo. Just like in these
> instances, where a semicolon was meant to be used, as there is no need to
> discard the first statement's result here.

The changes look obviously good to me, thanks. The reftable library and
backend also had several instances where the operator was used by
accident, and I've gotten rid of those over time. They typically don't
do any harm as the result is essentially the same, but sometimes they
may cause issues. And at the very least they cause confusion.

It would be great if there was a compiler warning we could enable for
cases where the operator likely isn't intentional. But I couldn't find
any, unfortunately.

Thanks!

Patrick

Copy link

gitgitgadget bot commented Mar 25, 2025

User Patrick Steinhardt <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Philip,

On Tue, 25 Mar 2025, Philip Oakley wrote:

> On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> > The comma operator
> > [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> > rarely used in C anymore, and typically indicates a typo. Just like in these
> > instances, where a semicolon was meant to be used, as there is no need to
> > discard the first statement's result here.
>
> Minor aside: How were these 'discovered'?

I am working on a GitHub workflow that uses CodeQL to find such issues,
that's how I found them. (I also worked with the CodeQL team to get this
query added, way back when I was still working at GitHub.)

Ciao,
Johannes

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Tue, 25 Mar 2025, Patrick Steinhardt wrote:

> On Tue, Mar 25, 2025 at 08:01:48AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > The comma operator
> > [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> > rarely used in C anymore, and typically indicates a typo. Just like in these
> > instances, where a semicolon was meant to be used, as there is no need to
> > discard the first statement's result here.
>
> The changes look obviously good to me, thanks. The reftable library and
> backend also had several instances where the operator was used by
> accident, and I've gotten rid of those over time. They typically don't
> do any harm as the result is essentially the same, but sometimes they
> may cause issues. And at the very least they cause confusion.

Thank you for addressing these in the reftable library!

> It would be great if there was a compiler warning we could enable for
> cases where the operator likely isn't intentional. But I couldn't find
> any, unfortunately.

I was not actually planning on adding the CodeQL workflow to git/git,
seeing as its CI is already taking way too much CPU time for my liking.
But in `microsoft/git`, I am kind of required to, so we'll catch those
issues there.

Ciao,
Johannes

@@ -1843,7 +1843,7 @@ int cmd_rebase(int argc,
strbuf_addf(&msg, "%s (start): checkout %s",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Phillip Wood wrote (reply to this):

On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> > diff --git a/builtin/rebase.c b/builtin/rebase.c
> index d4715ed35d7..62bdf7276f7 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1843,7 +1843,7 @@ int cmd_rebase(int argc,
>   	strbuf_addf(&msg, "%s (start): checkout %s",
>   		    options.reflog_action, options.onto_name);
>   	ropts.oid = &options.onto->object.oid;
> -	ropts.orig_head = &options.orig_head->object.oid,
> +	ropts.orig_head = &options.orig_head->object.oid;

This is obviously a typo - thanks for fixing it

Best Wishes

Phillip

>   	ropts.flags = RESET_HEAD_DETACH | RESET_ORIG_HEAD |
>   			RESET_HEAD_RUN_POST_CHECKOUT_HOOK;
>   	ropts.head_msg = msg.buf;

Copy link

gitgitgadget bot commented Mar 25, 2025

User Phillip Wood <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Karthik Nayak wrote (reply to this):

Patrick Steinhardt <[email protected]> writes:

> On Tue, Mar 25, 2025 at 08:01:48AM +0000, Johannes Schindelin via GitGitGadget wrote:
>> The comma operator
>> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
>> rarely used in C anymore, and typically indicates a typo. Just like in these
>> instances, where a semicolon was meant to be used, as there is no need to
>> discard the first statement's result here.
>
> The changes look obviously good to me, thanks. The reftable library and
> backend also had several instances where the operator was used by
> accident, and I've gotten rid of those over time. They typically don't
> do any harm as the result is essentially the same, but sometimes they
> may cause issues. And at the very least they cause confusion.
>
> It would be great if there was a compiler warning we could enable for
> cases where the operator likely isn't intentional. But I couldn't find
> any, unfortunately.
>

Clang does have '-Wcomma' [1], which seems to be exactly what we want.

[1]: https://clang.llvm.org/docs/DiagnosticsReference.html#wcomma

> Thanks!
>
> Patrick

Copy link

gitgitgadget bot commented Mar 25, 2025

User Karthik Nayak <[email protected]> has been added to the cc: list.

@@ -1401,7 +1401,7 @@ static int push_git(struct discovery *heads, int nr_spec, const char **specs)
packet_buf_flush(&preamble);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Johannes

On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> > The comma operator is a somewhat obscure C feature that is often used by
> mistake and can even cause unintentional code flow. Better use a
> semicolon instead.

clang's -Wcomma finds another instance in this file

@@ -1239,7 +1239,7 @@ static int fetch_git(struct discovery *heads,
 	packet_buf_flush(&preamble);

 	memset(&rpc, 0, sizeof(rpc));
-	rpc.service_name = "git-upload-pack",
+	rpc.service_name = "git-upload-pack";
 	rpc.gzip_request = 1;

 	err = rpc_service(&rpc, heads, args.v, &preamble, &rpc_result);

In fact it finds a surprising number in our code base. I was worried there would be a lot of false positives but so far all of the ones I've looked at would be better not using a ","

Best Wishes

Phillip

> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
>   remote-curl.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> > diff --git a/remote-curl.c b/remote-curl.c
> index 1273507a96c..57b515b37e7 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -1401,7 +1401,7 @@ static int push_git(struct discovery *heads, int nr_spec, const char **specs)
>   	packet_buf_flush(&preamble);
>   >   	memset(&rpc, 0, sizeof(rpc));
> -	rpc.service_name = "git-receive-pack",
> +	rpc.service_name = "git-receive-pack";
>   >   	err = rpc_service(&rpc, heads, args.v, &preamble, &rpc_result);
>   	if (rpc_result.len)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Mar 25, 2025 at 04:28:32PM +0000, Phillip Wood wrote:

> On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <[email protected]>
> > 
> > The comma operator is a somewhat obscure C feature that is often used by
> > mistake and can even cause unintentional code flow. Better use a
> > semicolon instead.
> 
> clang's -Wcomma finds another instance in this file
> 
> @@ -1239,7 +1239,7 @@ static int fetch_git(struct discovery *heads,
>  	packet_buf_flush(&preamble);
> 
>  	memset(&rpc, 0, sizeof(rpc));
> -	rpc.service_name = "git-upload-pack",
> +	rpc.service_name = "git-upload-pack";
>  	rpc.gzip_request = 1;
> 
>  	err = rpc_service(&rpc, heads, args.v, &preamble, &rpc_result);
> 
> In fact it finds a surprising number in our code base. I was worried there
> would be a lot of false positives but so far all of the ones I've looked at
> would be better not using a ","

It looks like there are a few tricky cases. Inside a loop condition,
a comma can be used to smuggle in an extra line. E.g., in wildmatch:

  do {
    ...
  } while (prev_ch = p_ch, (p_ch = *++p) != ']');

or there are a few spots in clar like:

  while ((d = (errno = 0, readdir(source_dir))) != NULL) {
    ...
  }
  cl_assert_(errno == 0, "Failed to iterate source dir");

These could probably be re-written, but it's not a totally trivial
change.

The rest of them seem pretty straight-forward (though you do have to
watch out for conditionals using it to stuff multiple lines into a
single "statement"):

diff --git a/builtin/rebase.c b/builtin/rebase.c
index d4715ed35d..62bdf7276f 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1843,7 +1843,7 @@ int cmd_rebase(int argc,
 	strbuf_addf(&msg, "%s (start): checkout %s",
 		    options.reflog_action, options.onto_name);
 	ropts.oid = &options.onto->object.oid;
-	ropts.orig_head = &options.orig_head->object.oid,
+	ropts.orig_head = &options.orig_head->object.oid;
 	ropts.flags = RESET_HEAD_DETACH | RESET_ORIG_HEAD |
 			RESET_HEAD_RUN_POST_CHECKOUT_HOOK;
 	ropts.head_msg = msg.buf;
diff --git a/diff-delta.c b/diff-delta.c
index a4faf73829..71d37368d6 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -438,19 +438,31 @@ create_delta(const struct delta_index *index,
 			op = out + outpos++;
 			i = 0x80;
 
-			if (moff & 0x000000ff)
-				out[outpos++] = moff >> 0,  i |= 0x01;
-			if (moff & 0x0000ff00)
-				out[outpos++] = moff >> 8,  i |= 0x02;
-			if (moff & 0x00ff0000)
-				out[outpos++] = moff >> 16, i |= 0x04;
-			if (moff & 0xff000000)
-				out[outpos++] = moff >> 24, i |= 0x08;
-
-			if (msize & 0x00ff)
-				out[outpos++] = msize >> 0, i |= 0x10;
-			if (msize & 0xff00)
-				out[outpos++] = msize >> 8, i |= 0x20;
+			if (moff & 0x000000ff) {
+				out[outpos++] = moff >> 0;
+				i |= 0x01;
+			}
+			if (moff & 0x0000ff00) {
+				out[outpos++] = moff >> 8;
+				i |= 0x02;
+			}
+			if (moff & 0x00ff0000) {
+				out[outpos++] = moff >> 16;
+				i |= 0x04;
+			}
+			if (moff & 0xff000000) {
+				out[outpos++] = moff >> 24;
+				i |= 0x08;
+			}
+
+			if (msize & 0x00ff) {
+				out[outpos++] = msize >> 0;
+				i |= 0x10;
+			}
+			if (msize & 0xff00) {
+				out[outpos++] = msize >> 8;
+				i |= 0x20;
+			}
 
 			*op = i;
 
diff --git a/kwset.c b/kwset.c
index 1714eada60..23ab015448 100644
--- a/kwset.c
+++ b/kwset.c
@@ -197,10 +197,14 @@ kwsincr (kwset_t kws, char const *text, size_t len)
       while (link && label != link->label)
 	{
 	  links[depth] = link;
-	  if (label < link->label)
-	    dirs[depth++] = L, link = link->llink;
-	  else
-	    dirs[depth++] = R, link = link->rlink;
+	  if (label < link->label) {
+	    dirs[depth++] = L;
+	    link = link->llink;
+	  }
+	  else {
+	    dirs[depth++] = R;
+	    link = link->rlink;
+	  }
 	}
 
       /* The current character doesn't have an outgoing link at
@@ -257,14 +261,14 @@ kwsincr (kwset_t kws, char const *text, size_t len)
 		  switch (dirs[depth + 1])
 		    {
 		    case L:
-		      r = links[depth], t = r->llink, rl = t->rlink;
-		      t->rlink = r, r->llink = rl;
+		      r = links[depth]; t = r->llink; rl = t->rlink;
+		      t->rlink = r; r->llink = rl;
 		      t->balance = r->balance = 0;
 		      break;
 		    case R:
-		      r = links[depth], l = r->llink, t = l->rlink;
-		      rl = t->rlink, lr = t->llink;
-		      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
+		      r = links[depth]; l = r->llink; t = l->rlink;
+		      rl = t->rlink; lr = t->llink;
+		      t->llink = l; l->rlink = lr; t->rlink = r; r->llink = rl;
 		      l->balance = t->balance != 1 ? 0 : -1;
 		      r->balance = t->balance != (char) -1 ? 0 : 1;
 		      t->balance = 0;
@@ -277,14 +281,14 @@ kwsincr (kwset_t kws, char const *text, size_t len)
 		  switch (dirs[depth + 1])
 		    {
 		    case R:
-		      l = links[depth], t = l->rlink, lr = t->llink;
-		      t->llink = l, l->rlink = lr;
+		      l = links[depth]; t = l->rlink; lr = t->llink;
+		      t->llink = l; l->rlink = lr;
 		      t->balance = l->balance = 0;
 		      break;
 		    case L:
-		      l = links[depth], r = l->rlink, t = r->llink;
-		      lr = t->llink, rl = t->rlink;
-		      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
+		      l = links[depth]; r = l->rlink; t = r->llink;
+		      lr = t->llink; rl = t->rlink;
+		      t->llink = l; l->rlink = lr; t->rlink = r; r->llink = rl;
 		      l->balance = t->balance != 1 ? 0 : -1;
 		      r->balance = t->balance != (char) -1 ? 0 : 1;
 		      t->balance = 0;
@@ -567,22 +571,22 @@ bmexec (kwset_t kws, char const *text, size_t size)
       {
 	while (tp <= ep)
 	  {
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
 	    if (d == 0)
 	      goto found;
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
 	    if (d == 0)
 	      goto found;
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
 	    if (d == 0)
 	      goto found;
-	    d = d1[U(tp[-1])], tp += d;
-	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
+	    d = d1[U(tp[-1])]; tp += d;
 	  }
 	break;
       found:
@@ -649,7 +653,7 @@ cwexec (kwset_t kws, char const *text, size_t len, struct kwsmatch *kwsmatch)
     mch = NULL;
   else
     {
-      mch = text, accept = kwset->trie;
+      mch = text; accept = kwset->trie;
       goto match;
     }
 
diff --git a/remote-curl.c b/remote-curl.c
index 1273507a96..590b228f67 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1239,7 +1239,7 @@ static int fetch_git(struct discovery *heads,
 	packet_buf_flush(&preamble);
 
 	memset(&rpc, 0, sizeof(rpc));
-	rpc.service_name = "git-upload-pack",
+	rpc.service_name = "git-upload-pack";
 	rpc.gzip_request = 1;
 
 	err = rpc_service(&rpc, heads, args.v, &preamble, &rpc_result);
@@ -1401,7 +1401,7 @@ static int push_git(struct discovery *heads, int nr_spec, const char **specs)
 	packet_buf_flush(&preamble);
 
 	memset(&rpc, 0, sizeof(rpc));
-	rpc.service_name = "git-receive-pack",
+	rpc.service_name = "git-receive-pack";
 
 	err = rpc_service(&rpc, heads, args.v, &preamble, &rpc_result);
 	if (rpc_result.len)
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 8889b8b62a..5a96e36dfb 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -211,8 +211,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
 			for (d = fmax; d >= fmin; d -= 2) {
 				i1 = XDL_MIN(kvdf[d], lim1);
 				i2 = i1 - d;
-				if (lim2 < i2)
-					i1 = lim2 + d, i2 = lim2;
+				if (lim2 < i2) {
+					i1 = lim2 + d;
+					i2 = lim2;
+				}
 				if (fbest < i1 + i2) {
 					fbest = i1 + i2;
 					fbest1 = i1;
@@ -223,8 +225,10 @@ static long xdl_split(unsigned long const *ha1, long off1, long lim1,
 			for (d = bmax; d >= bmin; d -= 2) {
 				i1 = XDL_MAX(off1, kvdb[d]);
 				i2 = i1 - d;
-				if (i2 < off2)
-					i1 = off2 + d, i2 = off2;
+				if (i2 < off2) {
+					i1 = off2 + d;
+					i2 = off2;
+				}
 				if (i1 + i2 < bbest) {
 					bbest = i1 + i2;
 					bbest1 = i1;

-Peff

Copy link

gitgitgadget bot commented Mar 25, 2025

User Jeff King <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Mar 25, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <[email protected]>
writes:

> The comma operator
> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> rarely used in C anymore, and typically indicates a typo. Just like in these
> instances, where a semicolon was meant to be used, as there is no need to
> discard the first statement's result here.
>
> Johannes Schindelin (2):
>   remote-curl: avoid using the comma operator unnecessarily
>   rebase: avoid using the comma operator unnecessarily

Well spotted.

These two looked somehow surprisingly bad.

If I hadn't known better, I may have spent quite some time wondering
if these are some ways to hide an unexpected behaviour behind the
differences between a comma and a semicolon for nefarious purposes.

Will queue.  Thanks.

>
>  builtin/rebase.c | 2 +-
>  remote-curl.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
>
> base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1889%2Fdscho%2Fcomma-operator-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1889/dscho/comma-operator-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1889

dscho added 5 commits March 25, 2025 22:11
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. Better use a
semicolon instead.

Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. Better use a
semicolon instead.

Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. Better use a
semicolon instead.

Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. In this instance, it
makes the code harder to read than necessary, too. Better use a
semicolon instead.

Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. While the code in
this patch used the comma operator intentionally (to avoid curly
brackets around two statements, each, that want to be guarded by a
condition), it is better to surround it with curly brackets and to use a
semicolon instead.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho force-pushed the comma-operator branch 2 times, most recently from 52d0e2a to 2f6f312 Compare March 25, 2025 21:54
@dscho
Copy link
Member Author

dscho commented Mar 25, 2025

/submit

Copy link

gitgitgadget bot commented Mar 25, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1889/dscho/comma-operator-v2

To fetch this version to local tag pr-1889/dscho/comma-operator-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1889/dscho/comma-operator-v2

@@ -376,9 +376,12 @@ fs_copydir_helper(const char *source, const char *dest, int dest_mode)
mkdir(dest, dest_mode);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:08PM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> 
> The comma operator is a somewhat obscure C feature that is often used by
> mistake and can even cause unintentional code flow. In this instance, it
> makes the code harder to read than necessary, too. Better use a
> semicolon instead.

This code has changed upstream already, but let's roll with your change
anyway. I plan to update the clar to the upstream version soonish once I
have landed integer comparisons, and will take care that there aren't
any other comment operators left.

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Wed, 26 Mar 2025, Patrick Steinhardt wrote:

> On Tue, Mar 25, 2025 at 11:32:08PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <[email protected]>
> >
> > The comma operator is a somewhat obscure C feature that is often used by
> > mistake and can even cause unintentional code flow. In this instance, it
> > makes the code harder to read than necessary, too. Better use a
> > semicolon instead.
>
> This code has changed upstream already, but let's roll with your change
> anyway. I plan to update the clar to the upstream version soonish once I
> have landed integer comparisons, and will take care that there aren't
> any other comment operators left.

Thank you for putting so much energy into improving the tests, I really
appreciate it!

Ciao,
Johannes

@@ -439,18 +439,18 @@ create_delta(const struct delta_index *index,
i = 0x80;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:10PM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> 
> The comma operator is a somewhat obscure C feature that is often used by
> mistake and can even cause unintentional code flow. That is why the
> `-Wcomma` option of clang was introduced: To identify unintentional uses
> of the comma operator.
> 
> Intentional uses include situations where one wants to avoid curly
> brackets around multiple statements that need to be guarded by a
> condition. This is the case here, as the repetitive nature of the
> statements is easier to see for a human reader this way.
> 
> To mark this usage as intentional, the return value of the statement
> before the comma needs to be cast to `void`, which we do here.
> 
> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
>  diff-delta.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/diff-delta.c b/diff-delta.c
> index a4faf73829b..a03ba10b2be 100644
> --- a/diff-delta.c
> +++ b/diff-delta.c
> @@ -439,18 +439,18 @@ create_delta(const struct delta_index *index,
>  			i = 0x80;
>  
>  			if (moff & 0x000000ff)
> -				out[outpos++] = moff >> 0,  i |= 0x01;
> +				(void)(out[outpos++] = moff >> 0),  i |= 0x01;
>  			if (moff & 0x0000ff00)
> -				out[outpos++] = moff >> 8,  i |= 0x02;
> +				(void)(out[outpos++] = moff >> 8),  i |= 0x02;
>  			if (moff & 0x00ff0000)
> -				out[outpos++] = moff >> 16, i |= 0x04;
> +				(void)(out[outpos++] = moff >> 16), i |= 0x04;
>  			if (moff & 0xff000000)
> -				out[outpos++] = moff >> 24, i |= 0x08;
> +				(void)(out[outpos++] = moff >> 24), i |= 0x08;
>  
>  			if (msize & 0x00ff)
> -				out[outpos++] = msize >> 0, i |= 0x10;
> +				(void)(out[outpos++] = msize >> 0), i |= 0x10;
>  			if (msize & 0xff00)
> -				out[outpos++] = msize >> 8, i |= 0x20;
> +				(void)(out[outpos++] = msize >> 8), i |= 0x20;

Hm. I think the end result is even more confusing than before. Why don't
we introduce curly braces here, same as in preceding commits?

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Wed, 26 Mar 2025, Patrick Steinhardt wrote:

> On Tue, Mar 25, 2025 at 11:32:10PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <[email protected]>
> >
> > The comma operator is a somewhat obscure C feature that is often used by
> > mistake and can even cause unintentional code flow. That is why the
> > `-Wcomma` option of clang was introduced: To identify unintentional uses
> > of the comma operator.
> >
> > Intentional uses include situations where one wants to avoid curly
> > brackets around multiple statements that need to be guarded by a
> > condition. This is the case here, as the repetitive nature of the
> > statements is easier to see for a human reader this way.
> >
> > To mark this usage as intentional, the return value of the statement
> > before the comma needs to be cast to `void`, which we do here.
> >
> > Signed-off-by: Johannes Schindelin <[email protected]>
> > ---
> >  diff-delta.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/diff-delta.c b/diff-delta.c
> > index a4faf73829b..a03ba10b2be 100644
> > --- a/diff-delta.c
> > +++ b/diff-delta.c
> > @@ -439,18 +439,18 @@ create_delta(const struct delta_index *index,
> >  			i = 0x80;
> >
> >  			if (moff & 0x000000ff)
> > -				out[outpos++] = moff >> 0,  i |= 0x01;
> > +				(void)(out[outpos++] = moff >> 0),  i |= 0x01;
> >  			if (moff & 0x0000ff00)
> > -				out[outpos++] = moff >> 8,  i |= 0x02;
> > +				(void)(out[outpos++] = moff >> 8),  i |= 0x02;
> >  			if (moff & 0x00ff0000)
> > -				out[outpos++] = moff >> 16, i |= 0x04;
> > +				(void)(out[outpos++] = moff >> 16), i |= 0x04;
> >  			if (moff & 0xff000000)
> > -				out[outpos++] = moff >> 24, i |= 0x08;
> > +				(void)(out[outpos++] = moff >> 24), i |= 0x08;
> >
> >  			if (msize & 0x00ff)
> > -				out[outpos++] = msize >> 0, i |= 0x10;
> > +				(void)(out[outpos++] = msize >> 0), i |= 0x10;
> >  			if (msize & 0xff00)
> > -				out[outpos++] = msize >> 8, i |= 0x20;
> > +				(void)(out[outpos++] = msize >> 8), i |= 0x20;
>
> Hm. I think the end result is even more confusing than before. Why don't
> we introduce curly braces here, same as in preceding commits?

The interleaved -/+ lines make it admittedly hard to see what I meant.
I'll unwind it a bit (presenting only the `moff` part, the same
consideration applies to `msize`):

		if (moff & 0x000000ff)
			(void)(out[outpos++] = moff >> 0),  i |= 0x01;
		if (moff & 0x0000ff00)
			(void)(out[outpos++] = moff >> 8),  i |= 0x02;
		if (moff & 0x00ff0000)
			(void)(out[outpos++] = moff >> 16), i |= 0x04;
		if (moff & 0xff000000)
			(void)(out[outpos++] = moff >> 24), i |= 0x08;

In this form, it is very obvious (from comparing the right-side half of
the lines) that a shifted version of `moff` is appended to `out` and `i`
gets a bit set, and the correlation between shift width and the set bit
is relatively easy to see and validate (at least my brain has an easy time
here, thanks to the alignment and thanks to visual similarity between the
non-blank parts).

It is admittedly quite a bit harder not to be distracted by the repetitive
`(void)(out[...` parts to understand and validate the `if` conditions on
the left-hand side, but thanks to those repetitive parts being identical,
and being only one line between those `if` lines, I can bring my brain to
focus only on the differences of the bitmask and understand and verify
them with relatively little effort.

When I compared this form to the following, the cognitive load to wrap my
head around the code is quite a bit higher there:

		if (moff & 0x000000ff) {
			out[outpos++] = moff >> 0;
			i |= 0x01;
		}
		if (moff & 0x0000ff00) {
			out[outpos++] = moff >> 8;
			i |= 0x02;
		}
		if (moff & 0x00ff0000) {
			out[outpos++] = moff >> 16;
			i |= 0x04;
		}
		if (moff & 0xff000000) {
			out[outpos++] = moff >> 24;
			i |= 0x08;
		}

The reason is the visual distance between the near-identical code.

Having said that, I do realize that my brain quite possibly works in
special ways here and that the general preference is to go with the latter
form.

Do you have a strong opinion which form to use?

Ciao,
Johannes

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Johannes

On 26/03/2025 07:20, Johannes Schindelin wrote:
> Hi Patrick,
> On Wed, 26 Mar 2025, Patrick Steinhardt wrote:
>> Hm. I think the end result is even more confusing than before. Why don't
>> we introduce curly braces here, same as in preceding commits?
> > The interleaved -/+ lines make it admittedly hard to see what I meant.
> I'll unwind it a bit (presenting only the `moff` part, the same
> consideration applies to `msize`):
> > 		if (moff & 0x000000ff)
> 			(void)(out[outpos++] = moff >> 0),  i |= 0x01;
> 		if (moff & 0x0000ff00)
> 			(void)(out[outpos++] = moff >> 8),  i |= 0x02;
> 		if (moff & 0x00ff0000)
> 			(void)(out[outpos++] = moff >> 16), i |= 0x04;
> 		if (moff & 0xff000000)
> 			(void)(out[outpos++] = moff >> 24), i |= 0x08;
> > In this form, it is very obvious (from comparing the right-side half of
> the lines) that a shifted version of `moff` is appended to `out` and `i`
> gets a bit set, and the correlation between shift width and the set bit
> is relatively easy to see and validate (at least my brain has an easy time
> here, thanks to the alignment and thanks to visual similarity between the
> non-blank parts).
> > It is admittedly quite a bit harder not to be distracted by the repetitive
> `(void)(out[...` parts to understand and validate the `if` conditions on
> the left-hand side,

That makes it pretty unreadable for me. If you're worried about the vertical space we could perhaps keep both statements on a single line so that we're only adding a single newline for the closing brace rather than two.

Best Wishes

Phillip

 but thanks to those repetitive parts being identical,
> and being only one line between those `if` lines, I can bring my brain to
> focus only on the differences of the bitmask and understand and verify
> them with relatively little effort.
> > When I compared this form to the following, the cognitive load to wrap my
> head around the code is quite a bit higher there:
> > 		if (moff & 0x000000ff) {
> 			out[outpos++] = moff >> 0;
> 			i |= 0x01;
> 		}
> 		if (moff & 0x0000ff00) {
> 			out[outpos++] = moff >> 8;
> 			i |= 0x02;
> 		}
> 		if (moff & 0x00ff0000) {
> 			out[outpos++] = moff >> 16;
> 			i |= 0x04;
> 		}
> 		if (moff & 0xff000000) {
> 			out[outpos++] = moff >> 24;
> 			i |= 0x08;
> 		}
> > The reason is the visual distance between the near-identical code.
> > Having said that, I do realize that my brain quite possibly works in
> special ways here and that the general preference is to go with the latter
> form.
> > Do you have a strong opinion which form to use?
> > Ciao,
> Johannes
> 

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Wed, Mar 26, 2025 at 10:17:50AM +0000, Phillip Wood wrote:
> Hi Johannes
>
> On 26/03/2025 07:20, Johannes Schindelin wrote:
> > Hi Patrick,
> > On Wed, 26 Mar 2025, Patrick Steinhardt wrote:
> > > Hm. I think the end result is even more confusing than before. Why don't
> > > we introduce curly braces here, same as in preceding commits?
> >
> > The interleaved -/+ lines make it admittedly hard to see what I meant.
> > I'll unwind it a bit (presenting only the `moff` part, the same
> > consideration applies to `msize`):
> >
> > 		if (moff & 0x000000ff)
> > 			(void)(out[outpos++] = moff >> 0),  i |= 0x01;
> > 		if (moff & 0x0000ff00)
> > 			(void)(out[outpos++] = moff >> 8),  i |= 0x02;
> > 		if (moff & 0x00ff0000)
> > 			(void)(out[outpos++] = moff >> 16), i |= 0x04;
> > 		if (moff & 0xff000000)
> > 			(void)(out[outpos++] = moff >> 24), i |= 0x08;
> >
> > In this form, it is very obvious (from comparing the right-side half of
> > the lines) that a shifted version of `moff` is appended to `out` and `i`
> > gets a bit set, and the correlation between shift width and the set bit
> > is relatively easy to see and validate (at least my brain has an easy time
> > here, thanks to the alignment and thanks to visual similarity between the
> > non-blank parts).
> >
> > It is admittedly quite a bit harder not to be distracted by the repetitive
> > `(void)(out[...` parts to understand and validate the `if` conditions on
> > the left-hand side,
>
> That makes it pretty unreadable for me. If you're worried about the vertical
> space we could perhaps keep both statements on a single line so that we're
> only adding a single newline for the closing brace rather than two.

My $.02 here would be that the form:

    if (moff & 0x000000ff) {
        out[outpos++] = moff >> 0;
        i |= 0x01;
    }

is the most readable, and fits within the conventions of our
CodingGuidelines. I don't really love the idea of writing:

    if (moff & 0x000000ff) {
        out[outpos++] = moff >> 0); i |= 0x01;
    }

, since it looks like a single-line if-statement and is thus tempting to
drop the braces, which would make the code incorrect, as 'i |= 0x01'
would be executed unconditionally. I suppose you could write

    if (moff & 0x000000ff) {
        out[outpos++] = moff >> 0); i |= 0x01; }

, but that feels even uglier to me ;-).

Thanks,
Taylor

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Chris Torek wrote (reply to this):

(Just picking the latest message to reply to, not really quite right)

> > >             if (moff & 0x000000ff)
> > >                     (void)(out[outpos++] = moff >> 0),  i |= 0x01;
> > >             if (moff & 0x0000ff00)
> > >                     (void)(out[outpos++] = moff >> 8),  i |= 0x02;
> > >             if (moff & 0x00ff0000)
> > >                     (void)(out[outpos++] = moff >> 16), i |= 0x04;
> > >             if (moff & 0xff000000)
> > >                     (void)(out[outpos++] = moff >> 24), i |= 0x08;

Might be overkill but:

        #define XXX(index) do { \
                if (moff & (0xffUL << ((index) * 8))) {
                        out[outpos++] = moff >> ((index) * 8);
                        i |= 1 << (index);
                }
        } while (0)

        XXX(0);
        XXX(1);
        XXX(2);
        XXX(3);

        #undef XXX

would do the trick.  Pick a proper name for XXX of course.

Chris

@@ -268,7 +268,7 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
p_ch = 0; /* This makes "prev_ch" get set to 0. */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:11PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/wildmatch.c b/wildmatch.c
> index 8ea29141bd7..ce8108a6d57 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -268,7 +268,7 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>  					p_ch = 0; /* This makes "prev_ch" get set to 0. */
>  				} else if (t_ch == p_ch)
>  					matched = 1;
> -			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
> +			} while ((void)(prev_ch = p_ch), (p_ch = *++p) != ']');
>  			if (matched == negated ||
>  			    ((flags & WM_PATHNAME) && t_ch == '/'))
>  				return WM_NOMATCH;

In this case I agree that it makes sense to not introduce curly braces
for brevity.

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Wed, 26 Mar 2025, Patrick Steinhardt wrote:

> On Tue, Mar 25, 2025 at 11:32:11PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/wildmatch.c b/wildmatch.c
> > index 8ea29141bd7..ce8108a6d57 100644
> > --- a/wildmatch.c
> > +++ b/wildmatch.c
> > @@ -268,7 +268,7 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> >  					p_ch = 0; /* This makes "prev_ch" get set to 0. */
> >  				} else if (t_ch == p_ch)
> >  					matched = 1;
> > -			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
> > +			} while ((void)(prev_ch = p_ch), (p_ch = *++p) != ']');
> >  			if (matched == negated ||
> >  			    ((flags & WM_PATHNAME) && t_ch == '/'))
> >  				return WM_NOMATCH;
>
> In this case I agree that it makes sense to not introduce curly braces
> for brevity.

I should probably have mentioned that this patch took the longest to write
of the entire patch series, by far. Not because of the changed code, that
was easy. No, when I wrote the commit message and spotted the `continue`,
I did not want to leave it at that because the code around that `continue`
looks... well, let's just say that it could be rewritten for clarity.

In fact, when I looked at the following part, I was immediately
_convinced_ that it is incorrect, and had to work very hard to understand
why it works correctly, even going so far as to single-step through a
couple of examples, e.g. `test-tool wildmatch wildmatch 'b' '[[:a-z]'`:

				} else if (p_ch == '[' && p[1] == ':') {
					const uchar *s;
					int i;
					for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
					if (!p_ch)
						return WM_ABORT_ALL;
					i = p - s - 1;
					if (i < 0 || p[-1] != ':') {
						/* Didn't find ":]", so treat like a normal set. */
						p = s - 2;
						p_ch = '[';
						if (t_ch == p_ch)
							matched = 1;
						continue;
					}

For context, here is a link:
https://gitlab.com/gitlab-org/git/-/blob/v2.49.0/wildmatch.c?ref_type=tags#L213-227.
At this point, `t_ch` is the current character in the text to match;
`p_ch` (and `*p`) is the current character in the _pattern_, and it is
_inside_ a `[...]` character range, and it wants to match a character
class of the form `[:alnum:]` but also allow for a regular character range
that starts by including the colon. And that latter scenario, where it is
_not_ a special character class, is what this `continue` is all about.

What threw me was that `t_ch == p_ch` condition _directly_ after assigning
`p_ch = '['`. It is still a pattern I would always immediately suspect to
be a bug: Why not compare `t_ch == '['` instead, which would be much more
obvious?

You will also note that the value of `i` is only used in the condition,
and it is basically used to determine whether the the colon was
immediately followed by the closing bracket or not, which could be
rewritten to be a lot more obvious.

So what does the `continue` do here? It skips back to the outer loop,
continuing with the `:` as next pattern character in the character range,
and for that it is crucial that the `p_ch` be set to the opening bracket
and `p` is rewound _just_ so that the assignments in the loop condition
can set things up for the next loop iteration (still within that `case
'['`) not to be thrown by that `[:`.

I probably did a terrible job explaining why the code works as intended,
but I'll just chalk it up to the contortions my brain had to exercise to
understand that code.

But all that's safely outside the scope of the question whether to use a
comma operator or not ;-)

Ciao,
Johannes

@@ -40,6 +40,10 @@ DEVELOPER_CFLAGS += -Wvla
DEVELOPER_CFLAGS += -Wwrite-strings
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:13PM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> 
> When compiling Git using `clang`, the `-Wcomma` option can be used to
> warn about code using the comma operator (because it is typically
> unintentional and wants to use the semicolon instead).
> 
> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
>  config.mak.dev | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/config.mak.dev b/config.mak.dev
> index 0fd8cc4d355..31423638169 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -40,6 +40,10 @@ DEVELOPER_CFLAGS += -Wvla
>  DEVELOPER_CFLAGS += -Wwrite-strings
>  DEVELOPER_CFLAGS += -fno-common
>  
> +ifneq ($(filter clang9,$(COMPILER_FEATURES)),)
> +DEVELOPER_CFLAGS += -Wcomma
> +endif
> +
>  ifneq ($(filter clang4,$(COMPILER_FEATURES)),)
>  DEVELOPER_CFLAGS += -Wtautological-constant-out-of-range-compare
>  endif

Let's squash the below diff into this commit. The loop already makes
sure that the compiler supports the flag, so there is no need to special
case Clang.

Patrick

diff --git a/meson.build b/meson.build
index dd231b669b6..a7658d62ea0 100644
--- a/meson.build
+++ b/meson.build
@@ -717,6 +717,7 @@ libgit_dependencies = [ ]
 # Makefile.
 if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argument_syntax() == 'gcc'
   foreach cflag : [
+    '-Wcomma',
     '-Wdeclaration-after-statement',
     '-Wformat-security',
     '-Wold-style-definition',

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Wed, 26 Mar 2025, Patrick Steinhardt wrote:

> On Tue, Mar 25, 2025 at 11:32:13PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <[email protected]>
> >
> > When compiling Git using `clang`, the `-Wcomma` option can be used to
> > warn about code using the comma operator (because it is typically
> > unintentional and wants to use the semicolon instead).
> >
> > Signed-off-by: Johannes Schindelin <[email protected]>
> > ---
> >  config.mak.dev | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/config.mak.dev b/config.mak.dev
> > index 0fd8cc4d355..31423638169 100644
> > --- a/config.mak.dev
> > +++ b/config.mak.dev
> > @@ -40,6 +40,10 @@ DEVELOPER_CFLAGS += -Wvla
> >  DEVELOPER_CFLAGS += -Wwrite-strings
> >  DEVELOPER_CFLAGS += -fno-common
> >
> > +ifneq ($(filter clang9,$(COMPILER_FEATURES)),)
> > +DEVELOPER_CFLAGS += -Wcomma
> > +endif
> > +
> >  ifneq ($(filter clang4,$(COMPILER_FEATURES)),)
> >  DEVELOPER_CFLAGS += -Wtautological-constant-out-of-range-compare
> >  endif
>
> Let's squash the below diff into this commit. The loop already makes
> sure that the compiler supports the flag, so there is no need to special
> case Clang.

Okay, will do.

For my curiosity...

> diff --git a/meson.build b/meson.build
> index dd231b669b6..a7658d62ea0 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -717,6 +717,7 @@ libgit_dependencies = [ ]
>  # Makefile.
>  if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argument_syntax() == 'gcc'

This `get_argument_syntax() == 'gcc'` condition catches `clang`? What
other compilers that aren't GCC does it catch?

Ciao,
Johannes

>    foreach cflag : [
> +    '-Wcomma',
>      '-Wdeclaration-after-statement',
>      '-Wformat-security',
>      '-Wold-style-definition',
>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Mar 26, 2025 at 08:50:24AM +0100, Johannes Schindelin wrote:
> On Wed, 26 Mar 2025, Patrick Steinhardt wrote:
> > diff --git a/meson.build b/meson.build
> > index dd231b669b6..a7658d62ea0 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -717,6 +717,7 @@ libgit_dependencies = [ ]
> >  # Makefile.
> >  if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argument_syntax() == 'gcc'
> 
> This `get_argument_syntax() == 'gcc'` condition catches `clang`? What
> other compilers that aren't GCC does it catch?

Yes, it does catch Clang as well as the Intel compiler. The list of
compilers and their respective syntax can be found at [1]

Patrick

[1]: https://mesonbuild.com/Reference-tables.html#compiler-ids

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Patrick,

On Wed, 26 Mar 2025, Patrick Steinhardt wrote:

> On Wed, Mar 26, 2025 at 08:50:24AM +0100, Johannes Schindelin wrote:
> > On Wed, 26 Mar 2025, Patrick Steinhardt wrote:
> > > diff --git a/meson.build b/meson.build
> > > index dd231b669b6..a7658d62ea0 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -717,6 +717,7 @@ libgit_dependencies = [ ]
> > >  # Makefile.
> > >  if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argument_syntax() == 'gcc'
> >
> > This `get_argument_syntax() == 'gcc'` condition catches `clang`? What
> > other compilers that aren't GCC does it catch?
>
> Yes, it does catch Clang as well as the Intel compiler. The list of
> compilers and their respective syntax can be found at [1]
>
> [1]: https://mesonbuild.com/Reference-tables.html#compiler-ids

Thank you for this valuable reference!

Ciao,
Johannes

Copy link

gitgitgadget bot commented Mar 26, 2025

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:04PM +0000, Johannes Schindelin via GitGitGadget wrote:
> The comma operator
> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> rarely used in C anymore, and typically indicates a typo. Just like in these
> instances, where a semicolon was meant to be used, as there is no need to
> discard the first statement's result here.
> 
> Changes since v1:
> 
>  * Use -Wcomma when compiling with clang and with DEVELOPER=1.
>  * Address the remaining instances pointed out by clang (and by Phillip).

Thanks for all of these fixes!

Patrick

Copy link

gitgitgadget bot commented Mar 26, 2025

On the Git mailing list, Taylor Blau wrote (reply to this):

On Tue, Mar 25, 2025 at 03:12:21PM +0100, Johannes Schindelin wrote:
> Hi Philip,
>
> On Tue, 25 Mar 2025, Philip Oakley wrote:
>
> > On 25/03/2025 08:01, Johannes Schindelin via GitGitGadget wrote:
> > > The comma operator
> > > [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> > > rarely used in C anymore, and typically indicates a typo. Just like in these
> > > instances, where a semicolon was meant to be used, as there is no need to
> > > discard the first statement's result here.
> >
> > Minor aside: How were these 'discovered'?
>
> I am working on a GitHub workflow that uses CodeQL to find such issues,
> that's how I found them. (I also worked with the CodeQL team to get this
> query added, way back when I was still working at GitHub.)

Neat. I will be curious to see how the results compare/contrast to what
those of us who run Coverity get.

Thanks,
Taylor

Copy link

gitgitgadget bot commented Mar 26, 2025

User Taylor Blau <[email protected]> has been added to the cc: list.

@@ -9,7 +9,7 @@ CC="$*"
#
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:14PM +0000, Johannes Schindelin via GitGitGadget wrote:

> In my setup, clang finds `/usr/local/cuda` and hence the output of
> `clang -v` ends with this line:
> 
> 	Found CUDA installation: /usr/local/cuda, version
> 
> This confuses the `detect-compiler` script because it matches _all_
> lines that contain the needle "version" surrounded by spaces. As a
> consequence, the `get_family` function returns two lines: "Ubuntu clang"
> and above-mentioned line, which the `case` statement does not handle
> well and hence reports "unknown compiler family" instead of the expected
> set of "clang14", "clang13", ..., "clang1" output.
> 
> Let's unconfuse the script by letting it parse the first matching line
> and ignore the rest.

Makes sense. I wondered if this:

>  get_version_line() {
> -	LANG=C LC_ALL=C $CC -v 2>&1 | grep ' version '
> +	LANG=C LC_ALL=C $CC -v 2>&1 | sed -n '/ version /{p;q}'

might be more readable with "grep -m1", but it looks like "-m" is not in
POSIX. So what you wrote is probably safer.

-Peff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Mar 26, 2025 at 1:44 PM Jeff King <[email protected]> wrote:
> On Tue, Mar 25, 2025 at 11:32:14PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > Let's unconfuse the script by letting it parse the first matching line
> > and ignore the rest.
>
> Makes sense. I wondered if this:
>
> >  get_version_line() {
> > -     LANG=C LC_ALL=C $CC -v 2>&1 | grep ' version '
> > +     LANG=C LC_ALL=C $CC -v 2>&1 | sed -n '/ version /{p;q}'
>
> might be more readable with "grep -m1", but it looks like "-m" is not in
> POSIX. So what you wrote is probably safer.

It's probably an indication that I've done too much `sed` programming,
but I find Dscho's version more obvious. That aside, your response
made me take a closer look at what Dscho wrote and I noticed that it
is syntactically flawed, at least for BSD-lineage `sed`. Testing on
macOS reveals that this is indeed so:

    % LANG=C LC_ALL=C cc -v 2>&1 | sed -n '/ version /{p;q}'
    sed: 1: "/ version /{p;q}": extra characters at the end of q command

The problem is that the `q` function takes no arguments, but
BSD-lineage `sed` thinks that the closing `}` is an argument rather
than a terminator. Fixing this requires inserting a terminator after
`q`, which will be either a newline character or a semicolon. So, the
correct form is:

    sed -n '/ version /{p;q;}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Mar 26, 2025 at 02:07:10PM -0400, Eric Sunshine wrote:

> On Wed, Mar 26, 2025 at 1:44 PM Jeff King <[email protected]> wrote:
> > On Tue, Mar 25, 2025 at 11:32:14PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > > Let's unconfuse the script by letting it parse the first matching line
> > > and ignore the rest.
> >
> > Makes sense. I wondered if this:
> >
> > >  get_version_line() {
> > > -     LANG=C LC_ALL=C $CC -v 2>&1 | grep ' version '
> > > +     LANG=C LC_ALL=C $CC -v 2>&1 | sed -n '/ version /{p;q}'
> >
> > might be more readable with "grep -m1", but it looks like "-m" is not in
> > POSIX. So what you wrote is probably safer.
> 
> It's probably an indication that I've done too much `sed` programming,
> but I find Dscho's version more obvious. That aside, your response
> made me take a closer look at what Dscho wrote and I noticed that it
> is syntactically flawed, at least for BSD-lineage `sed`. Testing on
> macOS reveals that this is indeed so:
> 
>     % LANG=C LC_ALL=C cc -v 2>&1 | sed -n '/ version /{p;q}'
>     sed: 1: "/ version /{p;q}": extra characters at the end of q command
> 
> The problem is that the `q` function takes no arguments, but
> BSD-lineage `sed` thinks that the closing `}` is an argument rather
> than a terminator. Fixing this requires inserting a terminator after
> `q`, which will be either a newline character or a semicolon. So, the
> correct form is:
> 
>     sed -n '/ version /{p;q;}

Heh, I think it was the braces and semicolons that made my spider-sense
tingle, probably because I've been bitten by those subtleties in the
past.

I think just "/foo/p;q" works on GNU sed, but no idea if it does
elsewhere. What you wrote seems the safest.

-Peff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Thu, Mar 27, 2025 at 1:14 AM Jeff King <[email protected]> wrote:
> On Wed, Mar 26, 2025 at 02:07:10PM -0400, Eric Sunshine wrote:
> > It's probably an indication that I've done too much `sed` programming,
> > but I find Dscho's version more obvious. That aside, your response
> > made me take a closer look at what Dscho wrote and I noticed that it
> > is syntactically flawed, at least for BSD-lineage `sed`. Testing on
> > macOS reveals that this is indeed so:
> >
> >     % LANG=C LC_ALL=C cc -v 2>&1 | sed -n '/ version /{p;q}'
> >     sed: 1: "/ version /{p;q}": extra characters at the end of q command
> >
> > The problem is that the `q` function takes no arguments, but
> > BSD-lineage `sed` thinks that the closing `}` is an argument rather
> > than a terminator. Fixing this requires inserting a terminator after
> > `q`, which will be either a newline character or a semicolon. So, the
> > correct form is:
> >
> >     sed -n '/ version /{p;q;}
>
> Heh, I think it was the braces and semicolons that made my spider-sense
> tingle, probably because I've been bitten by those subtleties in the
> past.
>
> I think just "/foo/p;q" works on GNU sed, but no idea if it does
> elsewhere. What you wrote seems the safest.

That's not quite the same, though. The patternless `q` will cause
`sed` to terminate upon reading the first line of input, not upon the
first line which contains " version ". This matters, for instance, if
the first line output by `$CC -v` is not the version string (i.e. it
might be a copyright notice).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, Mar 27, 2025 at 01:21:03AM -0400, Eric Sunshine wrote:

> > I think just "/foo/p;q" works on GNU sed, but no idea if it does
> > elsewhere. What you wrote seems the safest.
> 
> That's not quite the same, though. The patternless `q` will cause
> `sed` to terminate upon reading the first line of input, not upon the
> first line which contains " version ". This matters, for instance, if
> the first line output by `$CC -v` is not the version string (i.e. it
> might be a copyright notice).

Oh, duh, you're right. I even tested to make sure were still quitting
after matching, but of course that is because we were quitting even when
_not_ matching.

For some reason I also thought /foo/pq would work, but it doesn't. I
wonder if it used to under some implementations, or if I'm simply going
senile.

(The point still remains that what you wrote is the correct thing).

-Peff

Copy link

gitgitgadget bot commented Mar 26, 2025

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:04PM +0000, Johannes Schindelin via GitGitGadget wrote:

> The comma operator
> [https://en.cppreference.com/w/c/language/operator_other#Comma_operator] is
> rarely used in C anymore, and typically indicates a typo. Just like in these
> instances, where a semicolon was meant to be used, as there is no need to
> discard the first statement's result here.
> 
> Changes since v1:
> 
>  * Use -Wcomma when compiling with clang and with DEVELOPER=1.
>  * Address the remaining instances pointed out by clang (and by Phillip).

Thanks for fixing these. I checked the diff against the quick-and-dirty
patch I posted earlier in the thread, and your resolutions for the
"easy" cases look good (though like others, I'd prefer switching to
semicolons for the one in diff-delta.c).

For the harder cases inside while() conditionals, the rewrites all look
correct to me. I do think that getting rid of the commas often makes the
result easier to read, but the discussion on wildmatch shows that it's
easy to get the transformation wrong. So I'd be happy enough slapping a
"(void)" on that one and moving on with our lives. The goal is not to
prettify that code (which was not even written for Git in the first
place) but to silence -Wcomma false positives so that we can find the
actual problems.

-Peff

Copy link

gitgitgadget bot commented Mar 26, 2025

User Eric Sunshine <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Mar 26, 2025

On the Git mailing list, Taylor Blau wrote (reply to this):

On Tue, Mar 25, 2025 at 03:13:47PM +0100, Johannes Schindelin wrote:
> > It would be great if there was a compiler warning we could enable for
> > cases where the operator likely isn't intentional. But I couldn't find
> > any, unfortunately.
>
> I was not actually planning on adding the CodeQL workflow to git/git,
> seeing as its CI is already taking way too much CPU time for my liking.
> But in `microsoft/git`, I am kind of required to, so we'll catch those
> issues there.

Heh. I should have read this email from you before replying to the last
one ;-).

I would actually be interested in having CodeQL support "ship" for
copies of git.git that are hosted on GitHub. I wonder if we could do a
similar trick as in a56b6230d0 (ci: add a GitHub workflow to submit
Coverity scans, 2023-09-25), where the ENABLE_COVERITY_SCAN_FOR_BRANCHES
variable controls whether or not the job actually executes.

Thanks,
Taylor

@@ -1231,9 +1231,10 @@ re_node_set_merge (re_node_set *dest, const re_node_set *src)
for (sbase = dest->nelem + 2 * src->nelem,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Tue, Mar 25, 2025 at 11:32:12PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/compat/regex/regex_internal.c b/compat/regex/regex_internal.c
> index ec5cc5d2dd1..7672583bf7e 100644
> --- a/compat/regex/regex_internal.c
> +++ b/compat/regex/regex_internal.c
> @@ -1231,9 +1231,10 @@ re_node_set_merge (re_node_set *dest, const re_node_set *src)
>    for (sbase = dest->nelem + 2 * src->nelem,
>         is = src->nelem - 1, id = dest->nelem - 1; is >= 0 && id >= 0; )
>      {
> -      if (dest->elems[id] == src->elems[is])
> -	is--, id--;
> -      else if (dest->elems[id] < src->elems[is])
> +      if (dest->elems[id] == src->elems[is]) {
> +	is--;
> +	id--;
> +      } else if (dest->elems[id] < src->elems[is])

Should the other arms of this conditional have matching curly-braces?

Thanks,
Taylor

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Taylor,

On Wed, 26 Mar 2025, Taylor Blau wrote:

> On Tue, Mar 25, 2025 at 11:32:12PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/compat/regex/regex_internal.c b/compat/regex/regex_internal.c
> > index ec5cc5d2dd1..7672583bf7e 100644
> > --- a/compat/regex/regex_internal.c
> > +++ b/compat/regex/regex_internal.c
> > @@ -1231,9 +1231,10 @@ re_node_set_merge (re_node_set *dest, const re_node_set *src)
> >    for (sbase = dest->nelem + 2 * src->nelem,
> >         is = src->nelem - 1, id = dest->nelem - 1; is >= 0 && id >= 0; )
> >      {
> > -      if (dest->elems[id] == src->elems[is])
> > -	is--, id--;
> > -      else if (dest->elems[id] < src->elems[is])
> > +      if (dest->elems[id] == src->elems[is]) {
> > +	is--;
> > +	id--;
> > +      } else if (dest->elems[id] < src->elems[is])
>
> Should the other arms of this conditional have matching curly-braces?

No. Have a look around in that file, that's not the coding convention.

However, a valid concern is that I used Git's convention for the curly
brackets (appending the open bracket to the `if` line), which is _not_ the
convention used in this file, which uses GNU conventions instead. I will
address that concern in the next iteration.

Ciao,
Johannes

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Taylor Blau wrote (reply to this):

On Thu, Mar 27, 2025 at 11:29:16AM +0100, Johannes Schindelin wrote:
> Hi Taylor,
>
> On Wed, 26 Mar 2025, Taylor Blau wrote:
>
> > On Tue, Mar 25, 2025 at 11:32:12PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > > diff --git a/compat/regex/regex_internal.c b/compat/regex/regex_internal.c
> > > index ec5cc5d2dd1..7672583bf7e 100644
> > > --- a/compat/regex/regex_internal.c
> > > +++ b/compat/regex/regex_internal.c
> > > @@ -1231,9 +1231,10 @@ re_node_set_merge (re_node_set *dest, const re_node_set *src)
> > >    for (sbase = dest->nelem + 2 * src->nelem,
> > >         is = src->nelem - 1, id = dest->nelem - 1; is >= 0 && id >= 0; )
> > >      {
> > > -      if (dest->elems[id] == src->elems[is])
> > > -	is--, id--;
> > > -      else if (dest->elems[id] < src->elems[is])
> > > +      if (dest->elems[id] == src->elems[is]) {
> > > +	is--;
> > > +	id--;
> > > +      } else if (dest->elems[id] < src->elems[is])
> >
> > Should the other arms of this conditional have matching curly-braces?
>
> No. Have a look around in that file, that's not the coding convention.

I was just about to respond that even though it breaks the convention,
that we should encourage good hygeine by ensuring new code follows the
CodingGuidelines, even if it looks wonky in the context of the rest of
the file.

But this is compat/regex code, which clearly does not need to follow the
convention. Sorry about that!

Thanks,
Taylor

Copy link

gitgitgadget bot commented Mar 27, 2025

User Chris Torek <[email protected]> has been added to the cc: list.

dscho added 5 commits March 27, 2025 11:32
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. That is why the
`-Wcomma` option of clang was introduced: To identify unintentional uses
of the comma operator.

Intentional uses include situations where one wants to avoid curly
brackets around multiple statements that need to be guarded by a
condition. This is the case here, as the repetitive nature of the
statements is easier to see for a human reader this way. At least in my
opinion.

However, opinions on this differ wildly, take 10 people and you have 10
different preferences.

On the Git mailing list, it seems that the consensus is to use the long
form instead, so let's do just that.

Suggested-by: Phillip Wood <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. That is why the
`-Wcomma` option of clang was introduced: To identify unintentional uses
of the comma operator.

In this instance, the usage is intentional because it allows storing the
value of the current character as `prev_ch` before making the next
character the current one, all of which happens in the loop condition
that lets the loop stop at a closing bracket.

However, it is hard to read.

The chosen alternative to using the comma operator is to move those
assignments from the condition into the loop body; In this particular
case that requires special care because the loop body contains a
`continue` for the case where a character class is found that starts
with `[:` but does not end in `:]` (and the assignments should occur
even when that code path is taken), which needs to be turned into a
`goto`.

Helped-by: Phillip Wood <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. That is why the
`-Wcomma` option of clang was introduced: To identify unintentional uses
of the comma operator.

In the `compat/regex/` code, the comma operator is used twice, once to
avoid surrounding two conditional statements with curly brackets, the
other one to increment two counters simultaneously in a `do ... while`
condition.

The first one is replaced with a proper conditional block, surrounded by
curly brackets.

The second one would be harder to replace because the loop contains two
`continue`s. Therefore, the second one is marked as intentional by
casting the value-to-discard to `void`.

Signed-off-by: Johannes Schindelin <[email protected]>
When compiling Git using `clang`, the `-Wcomma` option can be used to
warn about code using the comma operator (because it is typically
unintentional and wants to use the semicolon instead).

Helped-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
In my setup, clang finds `/usr/local/cuda` and hence the output of
`clang -v` ends with this line:

	Found CUDA installation: /usr/local/cuda, version

This confuses the `detect-compiler` script because it matches _all_
lines that contain the needle "version" surrounded by spaces. As a
consequence, the `get_family` function returns two lines: "Ubuntu clang"
and above-mentioned line, which the `case` statement does not handle
well and hence reports "unknown compiler family" instead of the expected
set of "clang14", "clang13", ..., "clang1" output.

Let's unconfuse the script by letting it parse the first matching line
and ignore the rest.

Helped-by: Eric Sunshine <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
@dscho
Copy link
Member Author

dscho commented Mar 27, 2025

/submit

Copy link

gitgitgadget bot commented Mar 27, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1889/dscho/comma-operator-v3

To fetch this version to local tag pr-1889/dscho/comma-operator-v3:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1889/dscho/comma-operator-v3

Copy link

gitgitgadget bot commented Mar 27, 2025

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Johannes

On 27/03/2025 11:52, Johannes Schindelin via GitGitGadget wrote:
> > Changes since v2:
> >   * Made the sed construct in detect-compiler portable (thanks, Eric
>     Sunshine!)
>   * The majority of the feedback disagreed with the more compact format in
>     diff-delta.c, so I changed it to the long format (thanks, Phillip Wood!)
>   * The more succinct and safer, but less readable, cast in the loop
>     condition of the dowild() function was replaced with the goto-based
>     alternative I had mentioned as a possibility in the commit message
>     (thanks, Phillip Wood!)
>   * I adjusted the style of my compat/regex/ patch to the surrounding code's.
>   * The -Wcomma option is now used in Meson-based clang builds, too (thanks,
>     Patrick Steinhardt!)

The range-diff below looks good to me, thanks for making our code base clearer.

Best Wishes

Phillip

> Range-diff vs v2:
> >    1:  913c7a0d296 =  1:  913c7a0d296 remote-curl: avoid using the comma operator unnecessarily
>    2:  37ff88b8275 =  2:  37ff88b8275 rebase: avoid using the comma operator unnecessarily
>    3:  f601f4e74a5 =  3:  f601f4e74a5 kwset: avoid using the comma operator unnecessarily
>    4:  f60ebe376e1 =  4:  f60ebe376e1 clar: avoid using the comma operator unnecessarily
>    5:  7239078413f =  5:  7239078413f xdiff: avoid using the comma operator unnecessarily
>    6:  5e0e8325620 !  6:  045d695d73e diff-delta: explicitly mark intentional use of the comma operator
>       @@ Metadata
>        Author: Johannes Schindelin <[email protected]>
>        >         ## Commit message ##
>       -    diff-delta: explicitly mark intentional use of the comma operator
>       +    diff-delta: avoid using the comma operator
>        >            The comma operator is a somewhat obscure C feature that is often used by
>            mistake and can even cause unintentional code flow. That is why the
>       @@ Commit message
>            Intentional uses include situations where one wants to avoid curly
>            brackets around multiple statements that need to be guarded by a
>            condition. This is the case here, as the repetitive nature of the
>       -    statements is easier to see for a human reader this way.
>       +    statements is easier to see for a human reader this way. At least in my
>       +    opinion.
>        >       -    To mark this usage as intentional, the return value of the statement
>       -    before the comma needs to be cast to `void`, which we do here.
>       +    However, opinions on this differ wildly, take 10 people and you have 10
>       +    different preferences.
>        >       +    On the Git mailing list, it seems that the consensus is to use the long
>       +    form instead, so let's do just that.
>       +
>       +    Suggested-by: Phillip Wood <[email protected]>
>            Signed-off-by: Johannes Schindelin <[email protected]>
>        >         ## diff-delta.c ##
>        @@ diff-delta.c: create_delta(const struct delta_index *index,
>       + 			op = out + outpos++;
>         			i = 0x80;
>         >       - 			if (moff & 0x000000ff)
>       +-			if (moff & 0x000000ff)
>        -				out[outpos++] = moff >> 0,  i |= 0x01;
>       -+				(void)(out[outpos++] = moff >> 0),  i |= 0x01;
>       - 			if (moff & 0x0000ff00)
>       +-			if (moff & 0x0000ff00)
>        -				out[outpos++] = moff >> 8,  i |= 0x02;
>       -+				(void)(out[outpos++] = moff >> 8),  i |= 0x02;
>       - 			if (moff & 0x00ff0000)
>       +-			if (moff & 0x00ff0000)
>        -				out[outpos++] = moff >> 16, i |= 0x04;
>       -+				(void)(out[outpos++] = moff >> 16), i |= 0x04;
>       - 			if (moff & 0xff000000)
>       +-			if (moff & 0xff000000)
>        -				out[outpos++] = moff >> 24, i |= 0x08;
>       -+				(void)(out[outpos++] = moff >> 24), i |= 0x08;
>       -
>       - 			if (msize & 0x00ff)
>       +-
>       +-			if (msize & 0x00ff)
>        -				out[outpos++] = msize >> 0, i |= 0x10;
>       -+				(void)(out[outpos++] = msize >> 0), i |= 0x10;
>       - 			if (msize & 0xff00)
>       +-			if (msize & 0xff00)
>        -				out[outpos++] = msize >> 8, i |= 0x20;
>       -+				(void)(out[outpos++] = msize >> 8), i |= 0x20;
>       ++			if (moff & 0x000000ff) {
>       ++				out[outpos++] = moff >> 0;
>       ++				i |= 0x01;
>       ++			}
>       ++			if (moff & 0x0000ff00) {
>       ++				out[outpos++] = moff >> 8;
>       ++				i |= 0x02;
>       ++			}
>       ++			if (moff & 0x00ff0000) {
>       ++				out[outpos++] = moff >> 16;
>       ++				i |= 0x04;
>       ++			}
>       ++			if (moff & 0xff000000) {
>       ++				out[outpos++] = moff >> 24;
>       ++				i |= 0x08;
>       ++			}
>       ++
>       ++			if (msize & 0x00ff) {
>       ++				out[outpos++] = msize >> 0;
>       ++				i |= 0x10;
>       ++			}
>       ++			if (msize & 0xff00) {
>       ++				out[outpos++] = msize >> 8;
>       ++				i |= 0x20;
>       ++			}
>         >         			*op = i;
>         >    7:  9a6de12b807 !  7:  1d0ce59cb68 wildmatch: explicitly mark intentional use of the comma operator
>       @@ Metadata
>        Author: Johannes Schindelin <[email protected]>
>        >         ## Commit message ##
>       -    wildmatch: explicitly mark intentional use of the comma operator
>       +    wildmatch: avoid using of the comma operator
>        >            The comma operator is a somewhat obscure C feature that is often used by
>            mistake and can even cause unintentional code flow. That is why the
>            `-Wcomma` option of clang was introduced: To identify unintentional uses
>            of the comma operator.
>        >       -    To mark such a usage as intentional, the value needs to be cast to
>       -    `void`, which we do here.
>       -
>            In this instance, the usage is intentional because it allows storing the
>            value of the current character as `prev_ch` before making the next
>            character the current one, all of which happens in the loop condition
>            that lets the loop stop at a closing bracket.
>        >       -    The alternative to using the comma operator would be to move those
>       +    However, it is hard to read.
>       +
>       +    The chosen alternative to using the comma operator is to move those
>            assignments from the condition into the loop body; In this particular
>       -    case that would require the assignments to either be duplicated or to
>       -    introduce and use a `goto` target before the assignments, though,
>       -    because the loop body contains a `continue` for the case where a
>       -    character class is found that starts with `[:` but does not end in `:]`
>       -    (and the assignments should occur even when that code path is taken).
>       +    case that requires special care because the loop body contains a
>       +    `continue` for the case where a character class is found that starts
>       +    with `[:` but does not end in `:]` (and the assignments should occur
>       +    even when that code path is taken), which needs to be turned into a
>       +    `goto`.
>        >       +    Helped-by: Phillip Wood <[email protected]>
>            Signed-off-by: Johannes Schindelin <[email protected]>
>        >         ## wildmatch.c ##
>       +@@ wildmatch.c: static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>       + 						p_ch = '[';
>       + 						if (t_ch == p_ch)
>       + 							matched = 1;
>       +-						continue;
>       ++						goto next;
>       + 					}
>       + 					if (CC_EQ(s,i, "alnum")) {
>       + 						if (ISALNUM(t_ch))
>        @@ wildmatch.c: static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>         					p_ch = 0; /* This makes "prev_ch" get set to 0. */
>         				} else if (t_ch == p_ch)
>         					matched = 1;
>        -			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
>       -+			} while ((void)(prev_ch = p_ch), (p_ch = *++p) != ']');
>       ++next:
>       ++				prev_ch = p_ch;
>       ++				p_ch = *++p;
>       ++			} while (p_ch != ']');
>         			if (matched == negated ||
>         			    ((flags & WM_PATHNAME) && t_ch == '/'))
>         				return WM_NOMATCH;
>    8:  dc626f36df3 !  8:  b8405f3d237 compat/regex: explicitly mark intentional use of the comma operator
>       @@ Commit message
>        >         ## compat/regex/regex_internal.c ##
>        @@ compat/regex/regex_internal.c: re_node_set_merge (re_node_set *dest, const re_node_set *src)
>       -   for (sbase = dest->nelem + 2 * src->nelem,
>                is = src->nelem - 1, id = dest->nelem - 1; is >= 0 && id >= 0; )
>             {
>       --      if (dest->elems[id] == src->elems[is])
>       +       if (dest->elems[id] == src->elems[is])
>        -	is--, id--;
>       --      else if (dest->elems[id] < src->elems[is])
>       -+      if (dest->elems[id] == src->elems[is]) {
>       -+	is--;
>       -+	id--;
>       -+      } else if (dest->elems[id] < src->elems[is])
>       ++	{
>       ++	  is--;
>       ++	  id--;
>       ++	}
>       +       else if (dest->elems[id] < src->elems[is])
>         	dest->elems[--sbase] = src->elems[is--];
>               else /* if (dest->elems[id] > src->elems[is]) */
>       - 	--id;
>        >         ## compat/regex/regexec.c ##
>        @@ compat/regex/regexec.c: sift_states_bkref (const re_match_context_t *mctx, re_sift_context_t *sctx,
>    9:  91f86c3aba9 !  9:  6b6cd556465 clang: warn when the comma operator is used
>       @@ Commit message
>            warn about code using the comma operator (because it is typically
>            unintentional and wants to use the semicolon instead).
>        >       +    Helped-by: Patrick Steinhardt <[email protected]>
>            Signed-off-by: Johannes Schindelin <[email protected]>
>        >         ## config.mak.dev ##
>       @@ config.mak.dev: DEVELOPER_CFLAGS += -Wvla
>         ifneq ($(filter clang4,$(COMPILER_FEATURES)),)
>         DEVELOPER_CFLAGS += -Wtautological-constant-out-of-range-compare
>         endif
>       +
>       + ## meson.build ##
>       +@@ meson.build: libgit_dependencies = [ ]
>       + # Makefile.
>       + if get_option('warning_level') in ['2','3', 'everything'] and compiler.get_argument_syntax() == 'gcc'
>       +   foreach cflag : [
>       ++    '-Wcomma',
>       +     '-Wdeclaration-after-statement',
>       +     '-Wformat-security',
>       +     '-Wold-style-definition',
>   10:  2f6f31240fe ! 10:  77f1dcaca1c detect-compiler: detect clang even if it found CUDA
>       @@ Commit message
>            Let's unconfuse the script by letting it parse the first matching line
>            and ignore the rest.
>        >       +    Helped-by: Eric Sunshine <[email protected]>
>            Signed-off-by: Johannes Schindelin <[email protected]>
>        >         ## detect-compiler ##
>       @@ detect-compiler: CC="$*"
>         # FreeBSD clang version 3.4.1 (tags/RELEASE...)
>         get_version_line() {
>        -	LANG=C LC_ALL=C $CC -v 2>&1 | grep ' version '
>       -+	LANG=C LC_ALL=C $CC -v 2>&1 | sed -n '/ version /{p;q}'
>       ++	LANG=C LC_ALL=C $CC -v 2>&1 | sed -n '/ version /{p;q;}'
>         }
>         >         get_family() {
> 

Copy link

gitgitgadget bot commented Mar 28, 2025

This patch series was integrated into seen via git@38096ee.

@gitgitgadget gitgitgadget bot added the seen label Mar 28, 2025
Copy link

gitgitgadget bot commented Mar 29, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <[email protected]> writes:

> Hi Johannes
>
> On 27/03/2025 11:52, Johannes Schindelin via GitGitGadget wrote:
>> Changes since v2:
>>   * Made the sed construct in detect-compiler portable (thanks, Eric
>>     Sunshine!)
>>   * The majority of the feedback disagreed with the more compact format in
>>     diff-delta.c, so I changed it to the long format (thanks, Phillip Wood!)
>>   * The more succinct and safer, but less readable, cast in the loop
>>     condition of the dowild() function was replaced with the goto-based
>>     alternative I had mentioned as a possibility in the commit message
>>     (thanks, Phillip Wood!)
>>   * I adjusted the style of my compat/regex/ patch to the surrounding code's.
>>   * The -Wcomma option is now used in Meson-based clang builds, too (thanks,
>>     Patrick Steinhardt!)
>
> The range-diff below looks good to me, thanks for making our code base
> clearer.
>
> Best Wishes
>
> Phillip

Yup, thanks Dscho, and all who gave valuable input to polish the
series.

Will queue.

Copy link

gitgitgadget bot commented Mar 29, 2025

This patch series was integrated into seen via git@4c66618.

Copy link

gitgitgadget bot commented Apr 1, 2025

This branch is now known as js/comma-semicolon-confusion.

Copy link

gitgitgadget bot commented Apr 1, 2025

This patch series was integrated into seen via git@014e931.

Copy link

gitgitgadget bot commented Apr 7, 2025

There was a status update in the "Cooking" section about the branch js/comma-semicolon-confusion on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <[email protected]>

Copy link

gitgitgadget bot commented Apr 7, 2025

This patch series was integrated into seen via git@7289c9b.

Copy link

gitgitgadget bot commented Apr 7, 2025

This patch series was integrated into next via git@3c1f7f4.

@gitgitgadget gitgitgadget bot added the next label Apr 7, 2025
Copy link

gitgitgadget bot commented Apr 8, 2025

This patch series was integrated into seen via git@45dd469.

Copy link

gitgitgadget bot commented Apr 8, 2025

This patch series was integrated into seen via git@3c1f7f4.

Copy link

gitgitgadget bot commented Apr 8, 2025

This patch series was integrated into seen via git@b9c9aee.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant