Sonarqube

Table of Contents

sonar client/server architecture
how to use sonarqube?
- defining what “projectVersion” means
- bootstrapping sonar on a project
sonar projectVersion

External reference: https://en.wikipedia.org/wiki/SonarQube


SonarQube (formerly Sonar) is an open-source platform developed by SonarSource for continuous inspection of code quality to perform automatic reviews with static analysis of code to detect bugs, code smells, and security vulnerabilities on 20+ programming languages.

sonar client/server architecture

This is what I guessed.

On a CI or a developer's computer, sonar-scanner is run. It scans the code, gets the SCM annotations of the code (among other things, the date of the last commit that touched each line) and sends the analysis and the annotated code to the server¹, associated with a sonar.projectVersion value. Hereafter, we call this analysis the current version. Its date is either the date of the scan or whatever you put in the sonar.projectDate property.
The server stores the analyses in a stack. The order in which analyses are sent impacts how sonar behaves, so don’t send an analysis dated in the past (with sonar.projectDate) to an existing project; create a new project from scratch instead.
The server finds a date after which it considers the code as being new,
- either using a number of days before the date of the current version,
- or using a fixed date,
- or finding another analysis as a baseline and using its date:
  - if new code is defined relative to the previous version, it takes the topmost analysis of the stack whose projectVersion differs from the current version².
  - if new code is defined relative to a fixed version, it finds, from the top of the stack, the last analysis with that version.
The server now has two dates that delimit the time interval of new code.
In the current version's analysis, sonar finds all the lines whose annotation date falls in this interval: those become the new code.
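
To make the guess above concrete, here is a minimal Python sketch of it. It models my understanding only, not sonar's actual implementation: the names Analysis, new_code_start and new_lines, and the mode strings, are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class Analysis:
    version: str  # the sonar.projectVersion sent with the scan
    day: date     # date of the scan, or sonar.projectDate if given


def new_code_start(stack: List[Analysis], mode: str, days: int = 30,
                   fixed: Optional[date] = None,
                   version: Optional[str] = None) -> Optional[date]:
    """Return the date after which lines count as new code.

    `stack` is ordered oldest first; its last item is the current analysis.
    Returns None when no baseline analysis can be found.
    """
    current = stack[-1]
    if mode == "days":
        return current.day - timedelta(days=days)
    if mode == "date":
        return fixed
    # The two remaining modes walk down from the top of the stack,
    # skipping the current analysis itself.
    for analysis in reversed(stack[:-1]):
        if mode == "previous_version" and analysis.version != current.version:
            return analysis.day
        if mode == "fixed_version" and analysis.version == version:
            return analysis.day
    if mode in ("previous_version", "fixed_version"):
        return None
    raise ValueError(f"unknown mode: {mode}")


def new_lines(blame: Dict[int, date], start: date, end: date) -> List[int]:
    """Line numbers whose last-commit date falls after start, up to end."""
    return sorted(n for n, d in blame.items() if start < d <= end)


stack = [
    Analysis("v1", date(2024, 1, 10)),
    Analysis("v1", date(2024, 1, 20)),
    Analysis("v2", date(2024, 2, 5)),  # current version
]
# Blame data: line number -> date of the last commit that touched it.
blame = {1: date(2024, 1, 5), 2: date(2024, 1, 25), 3: date(2024, 2, 1)}
start = new_code_start(stack, "previous_version")
print(new_lines(blame, start, stack[-1].day))  # [2, 3]
```

In this example the baseline is the second v1 analysis (the topmost one whose version differs from v2), so only the lines committed after it are reported as new.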

how to use sonarqube?

I feel like sonarqube is a great tool. But simply using it without understanding the method would be pointless. So here is what I understood.

defining what “projectVersion” means

SonarQube follows the idea of clean as you code, meaning it strongly emphasizes reporting on new code.

Hence you need to make sure you correctly configure the new code definition.

In Maven projects, there are some rites, like mvn release:XXX, that make the pom.xml contain some version number. sonar appears to be automatically compatible only with those rites. In other projects I have seen, there are other rites.

I like using versioneer in my python projects so that I don’t have to think much about the version number. I simply work on the project, letting the CI tell me how healthy the code base is. When I have something that has enough features and whose quality is good, I simply git tag the code base.

The main difference with the maven-ish workflow is that giving a release name (made concrete by a git tag) is the last thing I do, even after the tests are run. I like this workflow because it is very compatible with semver and does not require planning what will be in the tag beforehand: I name the tag by looking at what has landed inside it. If I planned to release a minor version but in the meantime realize I will definitely need an intermediate bugfix version, I don’t need to change anything in my workflow.

Sonar’s new code definition may be based on

1. a number of days ago
2. some reference branch
3. changes of the provided version

I don’t use 1. as I don’t do regular releases. 2. would work for a big code base with a main branch (see best practices to use gitflow) but does not fit most of my projects well. Only the version-based definition remains.

Because git tag is my source of truth, I don’t want to repeat the information in a sonar-project.properties file (DRY). I cannot use a file as the source of truth because of the arguments above.

Hence, I set up my workflows to provide the version dynamically. It ends up like:

sonar-scanner -Dsonar.projectVersion=$(git tag --sort=creatordate --merged|grep '^v'|tail -1)

This git command finds the most recent tag reachable from the current branch that begins with the letter v. I use the prefix v to differentiate real releases from intermediate releases. I want sonar to take only the former into account.
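
The same selection can be sketched in Python, which makes the "no matching tag yet" case explicit (the shell pipeline just yields an empty string there). The function names latest_release_tag and merged_tags are made up for illustration; the git invocation is the one from the command above.

```python
import subprocess
from typing import List, Optional


def latest_release_tag(tags: List[str], prefix: str = "v") -> Optional[str]:
    """Return the last tag starting with `prefix`, or None if there is none.

    `tags` is expected oldest first, which is the order printed by
    `git tag --sort=creatordate --merged`.
    """
    releases = [t for t in tags if t.startswith(prefix)]
    return releases[-1] if releases else None


def merged_tags() -> List[str]:
    """Ask git for the tags reachable from HEAD, oldest first."""
    out = subprocess.run(
        ["git", "tag", "--sort=creatordate", "--merged"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


# Pure example, no git repository needed:
print(latest_release_tag(["v0.1.0", "rc-1", "v0.2.0", "tmp"]))  # v0.2.0
```

Inside a repository, latest_release_tag(merged_tags()) gives the value to pass as sonar.projectVersion.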

At first, it looks like this works well. After each tag, the new version shows up in the sonar reports.

But it actually has a drawback: in the reports, the version number appears to lag behind the tag.

Imagine I create two commits: a and b. Then imagine that I want to tag the version T1, then I commit c and then d.

The reports in sonar would be

a: no version
b: no version (I did not tag yet)
c: T1
d: T1

I would rather like to have reports showing that a and b are part of T1 and c and d part of some next release. The ideal reports should look like:

a: T1
b: T1
c: next
d: next

Imagine now that I tag T2 on d, then commit e and f. Now, the reports look like this.

a: no version
b: no version
c: T1
d: T1
e: T2
f: T2

This reads as if e and f were part of T2, whereas T2 was tagged on commit d. Commit d, in turn, looks like it belongs to T1.

I find this counter-intuitive. I would rather have something like:

a: T1
b: T1
c: T2
d: T2
e: next
f: next

Note that apart from that counter-intuitive naming, the reports actually make sense:

when working on a and b, I can see a report about the whole code
when working on c and d, I can see a report about the difference with b
when working on e and f, I can see a report about the difference with d

This is exactly what we want: to define new code as what differs from the tag. This is great.

A way to mitigate this counter-intuitive naming would be to provide a post-XXX version.

So I guess a more sonarish way of naming versions would be:

sonar-scanner -Dsonar.projectVersion=post-$(git tag --sort=creatordate --merged|grep '^v'|tail -1)

That way, the reports would look like:

a: no version
b: no version
c: post-T1
d: post-T1
e: post-T2
f: post-T2

That would convey the meaning that c and d are after T1 and not part of T1.
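
The labelling discussed in this section, with and without the post- prefix, can be replayed with a small Python simulation. It is only a sketch of the scenario described here: it assumes that every commit triggers one analysis labelled with the latest tag existing at that time, and the function name report_labels is made up.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # ("commit", name) or ("tag", name)


def report_labels(events: List[Event],
                  prefix: str = "") -> List[Tuple[str, Optional[str]]]:
    """Replay commits and tags in order and label each commit's report.

    A commit gets the most recent tag created before it (with `prefix`
    prepended), or None when no tag exists yet.
    """
    last_tag: Optional[str] = None
    labels: List[Tuple[str, Optional[str]]] = []
    for kind, name in events:
        if kind == "tag":
            last_tag = name
        elif kind == "commit":
            label = None if last_tag is None else prefix + last_tag
            labels.append((name, label))
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return labels


events = [
    ("commit", "a"), ("commit", "b"), ("tag", "T1"),
    ("commit", "c"), ("commit", "d"), ("tag", "T2"),
    ("commit", "e"), ("commit", "f"),
]
for commit, label in report_labels(events, prefix="post-"):
    print(f"{commit}: {label or 'no version'}")
```

With prefix="" the output is the delayed naming (c and d under T1); with prefix="post-" it is the post-T1 / post-T2 listing.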

Another idea would be to retrospectively rename the versions so that they match the expected values.

In the CI, I could have a fixed projectVersion=next, and each time I tag, I could use the sonar API to rename next into the tag value.

That way the following reports,

a: next
b: next

Would be renamed after tagging T1 into:

a: T1
b: T1

Then,

a: T1
b: T1
c: next
d: next

Would be renamed after tagging T2 into:

a: T1
b: T1
c: T2
d: T2

And finally,

a: T1
b: T1
c: T2
d: T2
e: next
f: next

Would become, after tagging T3 on f:

a: T1
b: T1
c: T2
d: T2
e: T3
f: T3

etc.

This is what makes most sense to me as the version contains the commits that eventually belong to the tag.
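
The bookkeeping of this renaming idea can be sketched in Python as follows. This only models the list of reports in memory; it does not call the sonar API, and which endpoint would do the actual renaming is left open here. The names PLACEHOLDER and rename_on_tag are made up.

```python
from typing import List, Tuple

Report = Tuple[str, str]  # (commit, version label)

PLACEHOLDER = "next"  # fixed projectVersion used by the CI


def rename_on_tag(reports: List[Report], tag: str) -> List[Report]:
    """Relabel every report still called 'next' with the new tag."""
    return [(commit, tag if label == PLACEHOLDER else label)
            for commit, label in reports]


reports = [("a", PLACEHOLDER), ("b", PLACEHOLDER)]
reports = rename_on_tag(reports, "T1")        # tag T1 on b
reports += [("c", PLACEHOLDER), ("d", PLACEHOLDER)]
reports = rename_on_tag(reports, "T2")        # tag T2 on d
reports += [("e", PLACEHOLDER), ("f", PLACEHOLDER)]
print(reports)
```

Reports that were already renamed are left alone, so only the commits made since the previous tag take the new name.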

For now, I won’t invest time in automating this, and I dislike having reports that look like post-XXX, so I guess I just need to stick a post-it somewhere saying that “sonar report T1 == code in T2”.

bootstrapping sonar on a project

If you are using sonar in an old project that already has a lot of technical debt, you won’t want to face all that debt at once.

In that case, you can simply set up sonar and let the tool show you the technical debt piece by piece during day-to-day work (see clean as you code).

If your project is new, you might want to deal with this technical debt from the beginning.

The thing is that sonar compares analyses. That means that the first analysis you submit will be your reference. If you created a bunch of initial code before analysing it, its technical debt will be in the “Overall Code” section but not in “New Code”.

So, if you want to bootstrap a new project, I warmly suggest you run an initial sonar-scanner in an empty directory, so that sonar will know this is the reference.

Beware that this might not be enough. If you already have code written in the past (meaning git commits with a committer date in the past), it won’t be considered new code (see ambiguity in computing the new code in sonarqube).

In that case, you have to put the first empty analysis in the past as well, using a sonar.projectDate far enough in the past.
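
As a rough sketch of that bootstrap step, the following Python builds the command line for the initial empty analysis. It assumes sonar.projectDate accepts a yyyy-MM-dd value; sonar.projectKey and sonar.projectBaseDir are scanner properties not discussed in this note; the key "my-project", the path "/tmp/empty" and both function names are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def reference_day(oldest_commit: date, margin_days: int = 1) -> date:
    """Pick a date strictly before the oldest commit of the repository."""
    return oldest_commit - timedelta(days=margin_days)


def bootstrap_scan_args(project_key: str, empty_dir: str,
                        day: date) -> List[str]:
    """Arguments for a first, empty analysis dated `day`."""
    return [
        "sonar-scanner",
        f"-Dsonar.projectKey={project_key}",
        f"-Dsonar.projectBaseDir={empty_dir}",   # directory with no code
        f"-Dsonar.projectDate={day.isoformat()}",
    ]


day = reference_day(date(2021, 3, 15))
print(" ".join(bootstrap_scan_args("my-project", "/tmp/empty", day)))
```

The regular analyses of the real code then follow, without sonar.projectDate, so that they are stacked after this empty reference.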

sonar projectVersion

¹ SonarQube relies on “blame” data from your SCM repository to understand which code is “new”. If no SCM data is available, then no code can be marked new and thus no metrics “on new code” can be calculated. If you notice, you also have no duplications, technical debt, bugs or vulnerabilities “on new code”.

Source: https://groups.google.com/g/sonarqube/c/VCV77hLwsNE?pli=1

² Changing the version label does not mean that there is new code. SQ core is looking in the SCM LOG files if there is new/changed code:

version 1=TRT-TMN_0.1.0.54 => date/time 1 in SCM LOG file
version 2=TRT-TMN_0.1.0.62 => date/time 2 in SCM LOG file
==> new / changed items in SCM LOG file between these two points in time

Source: https://github.com/SonarOpenCommunity/sonar-cxx/issues/1786