Discussion:
[janitorial] leading whitespace: spaces XOR tabs
Add Reply
Olaf Meeuwissen
2017-07-12 12:33:09 UTC
Reply
Permalink
Raw Message
Hi all,

I just committed the last compiler warning fixes and made the Debian 9
builds "AWARE". Now any compiler warnings on all 4 Debian builds will
bomb the build in question and hence prevent the creation of a new
snapshot tarball.

I mentioned[1] that the plustek-pp backend's indenting defied me but
after some staring at the code I realized it was using a four spaces to
the tab convention. Convincing my editor to do the same made it a lot
easier to understand the intended behaviour but fixing the "mess" was
still a very delicate affair. I had to change the mixed use of spaces
and tabs used to indent to *exactly* match in order to silence compiler
warnings.

[1] https://lists.alioth.debian.org/pipermail/sane-devel/2017-June/035445.html

This whole exercise has made me look at the whole code base in a little
more detail and, quite frankly, the use of leading whitespace is a total
and utter mess. Some places are better of than others but on the whole
it's pretty pathetic.

# Just make your editor visually distinguish spaces and tabs and you'll
# see. I used Emacs' whitespace-mode (in its default configuration) and
# it "lit up the place" in a variety of colours.

So here's a few "rules" I'd like to apply in order to address this.
Each file

- uses either spaces or tabs for *all* of its leading whitespace
This is *not* on a per line basis, this applies to the *whole* file.
- if using tabs, these tabs may be followed by up to 7 spaces
This is to accommodate n-space indent policies (where n mod 8 != 0)
and assumes eight spaces to the tab.
- if using tabs, uses eight spaces to the tab
- if to be used by `make`, an initial whitespace character shall be
a tab, independent of whether its on a continuation line or not

Note, this says absolutely nothing about whitespace use after the first
non-whitespace. That can still be a mess.

The "rules" above are inspired by a combination of consistency, ease of
checking/fixing and giving developers some leeway in applying their own
preferences to their code.

Whether to use spaces or tabs for each file will be decided on the basis
of a count of tabs and spaces (and in case of doubt comparison with any
related files so as to keep some kind of consistency between them). The
criterion will be a majority of one over the other (taking into account
the number of spaces to a tab). So, with n spaces to the tab, the
larger value of n*number_of_leading_tabs and number_of_leading_spaces
will decide the leading whitespace policy for a file.

# Personally, I would much prefer to uses spaces everywhere the file
# format permits (with a minor execption for files used by make, see
# above) over the board.
# You may find the following blog post of interest ;-)
#
# https://stackoverflow.blog/2017/06/15/developers-use-spaces-make-money-use-tabs/

If anyone objects, I'm open to suggestions, including "Why bother? Just
use spaces!" ;-)

If I hear nothing, the tools/style-check.sh script will be modified to
implement these rules and can then be use to check for any "violations"
and (recursively) fix them.

Hope this helps,
--
Olaf Meeuwissen, LPIC-2 FSF Associate Member since 2004-01-27
GnuPG key: F84A2DD9/B3C0 2F47 EA19 64F4 9F13 F43E B8A4 A88A F84A 2DD9
Support Free Software https://my.fsf.org/donate
Join the Free Software Foundation https://my.fsf.org/join
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
Gerhard Jäger
2017-07-12 13:35:36 UTC
Reply
Permalink
Raw Message
Hi Olaf,
Post by Olaf Meeuwissen
Hi all,
I just committed the last compiler warning fixes and made the Debian 9
builds "AWARE". Now any compiler warnings on all 4 Debian builds will
bomb the build in question and hence prevent the creation of a new
snapshot tarball.
I mentioned[1] that the plustek-pp backend's indenting defied me but
after some staring at the code I realized it was using a four spaces to
the tab convention. Convincing my editor to do the same made it a lot
easier to understand the intended behaviour but fixing the "mess" was
still a very delicate affair. I had to change the mixed use of spaces
and tabs used to indent to *exactly* match in order to silence compiler
warnings.
[1] https://lists.alioth.debian.org/pipermail/sane-devel/2017-June/035445.html
[...]

I obviously ignored you - sorry for that. And yes, I did follow at these old
days my own rules, not using tabs, but 4 spaces - in an inconsistent way.
Sorry - 'twas a long time ago ;)
Post by Olaf Meeuwissen
This whole exercise has made me look at the whole code base in a little
more detail and, quite frankly, the use of leading whitespace is a total
and utter mess. Some places are better of than others but on the whole
it's pretty pathetic.
# Just make your editor visually distinguish spaces and tabs and you'll
# see. I used Emacs' whitespace-mode (in its default configuration) and
# it "lit up the place" in a variety of colours.
So here's a few "rules" I'd like to apply in order to address this.
Each file
[...]
Post by Olaf Meeuwissen
# Personally, I would much prefer to uses spaces everywhere the file
# format permits (with a minor execption for files used by make, see
# above) over the board.
# You may find the following blog post of interest ;-)
#
# https://stackoverflow.blog/2017/06/15/developers-use-spaces-make-money-use-tabs/
If anyone objects, I'm open to suggestions, including "Why bother? Just
use spaces!" ;-)
If I hear nothing, the tools/style-check.sh script will be modified to
implement these rules and can then be use to check for any "violations"
and (recursively) fix them.
Are you sure you want to rework the whole code base? Is it a good idea?
Well maybe for unmaintained backends, but what about the others? In fact
I have here some long pending patch (64 bit awareness) that most probably
no longer apply.

Just my two cents,
Gerhard

BTE: Thanks for caring anyhow.
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
m. allan noah
2017-07-12 14:23:23 UTC
Reply
Permalink
Raw Message
1. I routinely use git blame to find out when I changed some line of
code. A massive whitespace commit would wreck that. Yes, there are
other ways to get that info after such a cleanup, but I'm lazy :)

2. I've read a great deal of other people's code over the years, and I
am generally stumped by their logic and lack of comments. Whitespace
is rarely a concern, even when they used a weird layout.

3. There are patches floating around in private repos that may not
apply cleanly after such a change.

Given those, I'm inclined to leave maintained backends untouched. For
unmaintained backends, it certainly makes sense to do such
reformatting when a bugfix is done. Doing it beforehand is
questionable, I don't feel strongly either way. I suppose cleaning up
the unmaintained backends makes it slightly easier for a new
maintainer to step into the code.

allan
Post by Gerhard Jäger
Hi Olaf,
Post by Olaf Meeuwissen
Hi all,
I just committed the last compiler warning fixes and made the Debian 9
builds "AWARE". Now any compiler warnings on all 4 Debian builds will
bomb the build in question and hence prevent the creation of a new
snapshot tarball.
I mentioned[1] that the plustek-pp backend's indenting defied me but
after some staring at the code I realized it was using a four spaces to
the tab convention. Convincing my editor to do the same made it a lot
easier to understand the intended behaviour but fixing the "mess" was
still a very delicate affair. I had to change the mixed use of spaces
and tabs used to indent to *exactly* match in order to silence compiler
warnings.
[1]
https://lists.alioth.debian.org/pipermail/sane-devel/2017-June/035445.html
[...]
I obviously ignored you - sorry for that. And yes, I did follow at these old
days my own rules, not using tabs, but 4 spaces - in an inconsistent way.
Sorry - 'twas a long time ago ;)
Post by Olaf Meeuwissen
This whole exercise has made me look at the whole code base in a little
more detail and, quite frankly, the use of leading whitespace is a total
and utter mess. Some places are better of than others but on the whole
it's pretty pathetic.
# Just make your editor visually distinguish spaces and tabs and you'll
# see. I used Emacs' whitespace-mode (in its default configuration) and
# it "lit up the place" in a variety of colours.
So here's a few "rules" I'd like to apply in order to address this.
Each file
[...]
Post by Olaf Meeuwissen
# Personally, I would much prefer to uses spaces everywhere the file
# format permits (with a minor execption for files used by make, see
# above) over the board.
# You may find the following blog post of interest ;-)
#
#
https://stackoverflow.blog/2017/06/15/developers-use-spaces-make-money-use-tabs/
If anyone objects, I'm open to suggestions, including "Why bother? Just
use spaces!" ;-)
If I hear nothing, the tools/style-check.sh script will be modified to
implement these rules and can then be use to check for any "violations"
and (recursively) fix them.
Are you sure you want to rework the whole code base? Is it a good idea?
Well maybe for unmaintained backends, but what about the others? In fact
I have here some long pending patch (64 bit awareness) that most probably
no longer apply.
Just my two cents,
Gerhard
BTE: Thanks for caring anyhow.
--
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
--
"well, I stand up next to a mountain- and I chop it down with the edge
of my hand"
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
Johannes Meixner
2017-07-13 09:03:49 UTC
Reply
Permalink
Raw Message
Hello,
Post by m. allan noah
1. I routinely use git blame to find out when I changed some line of
code. A massive whitespace commit would wreck that. Yes, there are
other ways to get that info after such a cleanup, but I'm lazy :)
Because there are also various other reasons why 'git blame'
does not show who made substantial code changes, I personally
use meanwhile only

git log -p --follow <filename> | less

to find out who relly changed what in a substantial way
and even more at the same time I can easily see how
a part of code changed over the time (e.g. to find out
possibly "false fixes" in the past).
Post by m. allan noah
2. I've read a great deal of other people's code over the years,
and I am generally stumped by their logic and lack of comments.
Whitespace is rarely a concern, even when they used a weird layout.
I cannot agree more with "stumped by ... lack of comments", cf.
https://github.com/rear/rear/wiki/Coding-Style

In my opinion
first and foremost source code is there to tell others (humans)
what the author had meant that the machine should do.
It is of secondary importance that source code tells the machine
what to do.

Strictly speaking
source code never tells the machine what to do.
Only machine code tells the machine what to do.
Humans make source code.
Tools make machine code from source code.
Source code tells tools what machine code to make.
I used simply "source code tells the machine what to do".

Reasoning:
If source code tells humans what is meant that the machine should do
but that source code tells the machine to do something different
(i.e. there is a bug in that source code), then other humans
can see the bug (i.e. understand the issue) and fix it.
In contrast when source code only tells the machine what to do
nobody can properly fix that source code in case of issues
(at least not with reasonable effort).
When there are issues with such source code, the proper way out is
to replace that useless source code with useful source code that
first and foremost tells humans what is meant that it should do.

Simply put:
Source code that is for machines but not for humans will die out.
Or in other words:
Humans will delete source code that is for machines but not for humans.

See also Eric Raymond's "Rule of Clarity" at
https://en.wikipedia.org/wiki/Unix_philosophy
or directly at
http://www.catb.org/~esr/writings/taoup/html/ch01s06.html#id2877610
Post by m. allan noah
3. There are patches floating around in private repos that may not
apply cleanly after such a change.
On the other had when only using spaces (but no tabs) all
indentations look well even in 'diff' output (i.e. in patches)
which aids readability of patches for humans.



Finally I think Olaf Meeuwissen should do us all a favour and
let make us all more money by using spaces instead of tabs ;-)


Kind Regards
Johannes Meixner
--
SUSE LINUX GmbH - GF: Felix Imendoerffer, Jane Smithard,
Graham Norton - HRB 21284 (AG Nuernberg)
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
David Niklas
2017-08-11 04:03:52 UTC
Reply
Permalink
Raw Message
On Wed, 12 Jul 2017 15:35:36 +0200
Gerhard Jäger <***@gjaeger.de> wrote:
<snip>
Post by Gerhard Jäger
Post by Olaf Meeuwissen
This whole exercise has made me look at the whole code base in a
little more detail and, quite frankly, the use of leading whitespace
is a total and utter mess. Some places are better of than others but
on the whole it's pretty pathetic.
Post by Olaf Meeuwissen
# Just make your editor visually distinguish spaces and tabs and you'll
# see. I used Emacs' whitespace-mode (in its default configuration)
and # it "lit up the place" in a variety of colours.
Post by Olaf Meeuwissen
So here's a few "rules" I'd like to apply in order to address this.
Each file
[...]
Post by Olaf Meeuwissen
# Personally, I would much prefer to uses spaces everywhere the file
# format permits (with a minor execption for files used by make, see
# above) over the board.
# You may find the following blog post of interest ;-)
#
#
https://stackoverflow.blog/2017/06/15/developers-use-spaces-make-money-use-tabs/
Post by Olaf Meeuwissen
If anyone objects, I'm open to suggestions, including "Why bother? Just
use spaces!" ;-)
Post by Olaf Meeuwissen
If I hear nothing, the tools/style-check.sh script will be modified to
implement these rules and can then be use to check for any
"violations" and (recursively) fix them.
Are you sure you want to rework the whole code base? Is it a good idea?
Well maybe for unmaintained backends, but what about the others? In fact
I have here some long pending patch (64 bit awareness) that most
probably no longer apply.
<snip>

Isn't this a moot point? Use "diff -w" or it's git alternative (I don't
know what it is).

Sincerely,
David
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to san
Olaf Meeuwissen
2017-07-20 11:49:42 UTC
Reply
Permalink
Raw Message
Hi all,

Apologies for replying to my own, original post but I want to reply to
all initial follow-up in a single post.

First off, Gerhard, Allan, Johannes, many thanks for the feedback.

Your summarized comments with my replies first, followed by in-line
comments on my initial post.

1. The whole code base? Is that worth it?

In principle, yes, the whole code base unless backend maintainers
balk. Whether it's worth it, I don't know. I only know that at
least I will feel better (but that's not all important, of course).
As a self-appointed janitor I get to deal with a larger part of the
code base than the average backend maintainer. Switching between
space/tab conventions on a per line basis and indenting and coding
styles between blocks (at best!) sometimes feels like a juggling
job.
# And my hand-eye coordination leaves to be desired ;-)

2. How about unmaintained backends only?

That might be an option but please note that the maintained status
was decided based on "backend contributor mentioned in the AUTHORS
file with commit privileges" without regard for actual status.

@Gerhard> Are you still *maintaining* the plustek and plustek-pp
backends? If yes, how about you clean up ;-)?

3. Wouldn't that break people's patches?

Get them upstreamed. Seriously, what's keeping everyone's generally
usable private patches from getting included?
In case your patches don't make it in for some reason, have a good
look at the documentation for the --ignore-whitespace options. Both
git am and git apply as well as the patch program support it.

4. But that makes git blame harder to use!

git blame is only really useful if your UI let's you quickly jump to
the changeset in question and move to the one before it with ease.
If it doesn't, Johannes' git log --follow -p is magnitudes better.
# BTW, I use both approaches, judiciously.

5. Whitespace issues haven't really bothered me.

That's you. They do bother *me*. That's why I brought this up in
the first place ;-)
Clean, consistently formatted code is a lot easier and inviting to
work with. A SANE approach to leading whitespace at the file level
is a start. Pun intended.
As you mentioned, such code may make it easier for folks to pick up
the maintenance of a backend. If so, we're off for the better.
# As a "janitor", I'm just trying to fix "broken windows" and patch
# up "cracks in the walls" of the sane-backends apartment building.

6. Spaces-only leads to more intelligible diffs.

ACK. Less weird indenting when looking at the diff.

7. Spaces everywhere please, I want to get rich quick ;-)

Let's combine that with a one-space indent style so we can inject a
pyramid scheme and get rich quicker yet :-P

And now for some observations pouring over my inital post and the
current state of white space use.
[... fixing "hair-raising" -Wmisleading-indentation issues ...]
This whole exercise has made me look at the whole code base in a little
more detail and, quite frankly, the use of leading whitespace is a total
and utter mess. Some places are better of than others but on the whole
it's pretty pathetic.
OK, so that may have been a bit over the top, but, hey, I want to grab
everyone's attention.
So here's a few "rules" I'd like to apply in order to address this.
Each file
- uses either spaces or tabs for *all* of its leading whitespace
This is *not* on a per line basis, this applies to the *whole* file.
- if using tabs, these tabs may be followed by up to 7 spaces
This is to accommodate n-space indent policies (where n mod 8 != 0)
and assumes eight spaces to the tab.
- if using tabs, uses eight spaces to the tab
- if to be used by `make`, an initial whitespace character shall be
a tab, independent of whether its on a continuation line or not
Let's add ChangeLogs/ChangeLog-1.0.$n for n < 26 to the files that have
a leading tab. Most of them are pretty close to that already.

I've written (and attached) a little script that analyses every file you
throw at it following something close to the above. Please have a look
at the script to get an idea of what is does and what the output is all
about.

Suggested use:

git ls-files | xargs -n 1 tabs-and-spaces.sh
The "rules" above are inspired by a combination of consistency, ease of
checking/fixing and giving developers some leeway in applying their own
preferences to their code.
It turns out that the checking is the easy part. The fixing can be a
pain in the behind and may require a fair bit of manual intervention :-(
Whether to use spaces or tabs for each file will be decided on the basis
of a count of tabs and spaces (and in case of doubt comparison with any
related files so as to keep some kind of consistency between them). The
criterion will be a majority of one over the other (taking into account
the number of spaces to a tab). So, with n spaces to the tab, the
larger value of n*number_of_leading_tabs and number_of_leading_spaces
will decide the leading whitespace policy for a file.
This turns out to be not such a clear cut issue. I'll need to study the
output of my script in a more detail and perhaps apply some case-by-case
logic.
If I hear nothing, the tools/style-check.sh script will be modified to
implement these rules and can then be use to check for any "violations"
and (recursively) fix them.
I did hear something, so I'll think things over a bit more and may end
up doing something in a branch so that we can all take a look and decide
what's for the better.

Hope this helps,
--
Olaf Meeuwissen, LPIC-2 FSF Associate Member since 2004-01-27
GnuPG key: F84A2DD9/B3C0 2F47 EA19 64F4 9F13 F43E B8A4 A88A F84A 2DD9
Support Free Software https://my.fsf.org/donate
Join the Free Software Foundation https://my.fsf.org/join
m. allan noah
2017-07-20 13:07:27 UTC
Reply
Permalink
Raw Message
Olaf, you make a pretty convincing case, especially as regards to your
wider than average exposure to the codebase. If you feel strongly
about this, and are willing to do the work (which I think will be
harder than you expect), then I will remove my objections.

allan
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
Gerhard Jäger
2017-07-21 09:56:19 UTC
Reply
Permalink
Raw Message
Well - Olaf,

seems nothing can stop you from doin' that ;)
I second Allan, so go on. If I ever will apply my changes, I think I can
live with it.

Cheers,
Gerhard
Post by m. allan noah
Olaf, you make a pretty convincing case, especially as regards to your
wider than average exposure to the codebase. If you feel strongly
about this, and are willing to do the work (which I think will be
harder than you expect), then I will remove my objections.
allan
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
Olaf Meeuwissen
2017-07-21 11:50:51 UTC
Reply
Permalink
Raw Message
Hi,
Post by Gerhard Jäger
Well - Olaf,
seems nothing can stop you from doin' that ;)
Convincing counter-arguments can but I can be hard to convince ;-)
Post by Gerhard Jäger
I second Allan, so go on. If I ever will apply my changes, I think I can
live with it.
Cheers,
Gerhard
Post by m. allan noah
Olaf, you make a pretty convincing case, especially as regards to your
wider than average exposure to the codebase. If you feel strongly
about this, and are willing to do the work (which I think will be
harder than you expect), then I will remove my objections.
Hard? No. A lot of work? Sure. More work than I thought it would be?
I've already realized that. Am I deterred by that? Not at all.

There were several hundred compiler warnings on Debian GNU/Linux 8.x in
the past. They're gone now. The new ones, some 50 or so?, that popped
up in 9.0 are a thing of the past now too. Warnings on Alpine are down
too and I'm sensing they're being gone is within reach. Fedora is on
the list next ;-)

Anyways, with all concerns of the two of you out of the way now, I'll
have a good look at it again and clean up things starting with the low
hanging fruit. I'll also add policy checking at a warning level so we
can all see where we're at in the CI build logs.

As long as there are tools that can check this "boring" stuff for you,
it is relatively easy keep things in order.

Hope this helps,
--
Olaf Meeuwissen, LPIC-2 FSF Associate Member since 2004-01-27
GnuPG key: F84A2DD9/B3C0 2F47 EA19 64F4 9F13 F43E B8A4 A88A F84A 2DD9
Support Free Software https://my.fsf.org/donate
Join the Free Software Foundation https://my.fsf.org/join
--
sane-devel mailing list: sane-***@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/sane-devel
Unsubscribe: Send mail with subject "unsubscribe your_password"
to sane-devel-***@lists.alioth.debian.org
Loading...