Skip to content

feat: add golicenser linter #5751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

joshuasing
Copy link

@joshuasing joshuasing commented Apr 19, 2025

Add golicenser (https://github.com/joshuasing/golicenser) linter.

golicenser aims to be a fast, easy-to-use and highly customisable linter for managing license headers.

I have done my best to make sure this pull request meets the requirements for new linters, however if I have missed anything, please let me know.

I am the author and maintainer of golicenser, and I would love to see it included in golangci-lint. I am more than happy to make any necessary changes to this pull request or golicenser itself, and any feedback would be greatly appreciated. Thank you.

Copy link

boring-cyborg bot commented Apr 19, 2025

Hey, thank you for opening your first Pull Request !

@CLAassistant
Copy link

CLAassistant commented Apr 19, 2025

CLA assistant check
All committers have signed the CLA.

@ldez
Copy link
Member

ldez commented Apr 19, 2025

In order for a pull request adding a linter to be reviewed, the linter and the PR must follow some requirements.

  • The CLA must be signed

Pull Request Description

  • It must have a link to the linter repository.
  • It must provide a short description of the linter.

Linter

  • It must not be a duplicate of another linter or a rule of a linter (the team will help to verify that).
  • It must have a valid license (AGPL is not allowed), and the file must contain the required information by the license, ex: author, year, etc.
  • It must use Go version <= 1.23.0
  • The linter repository must have a CI and tests.
  • It must use go/analysis.
  • It must have a valid tag, ex: v1.0.0, v0.1.0.
  • It must not contain init().
  • It must not contain panic().
  • It must not contain log.Fatal(), os.Exit(), or similar.
  • It must not modify the AST.
  • It must not have false positives/negatives (the team will help to verify that).
  • It must have tests inside golangci-lint.

The Linter Tests Inside Golangci-lint

  • They must have at least one std lib import.
  • They must have integration tests without configuration (default).
  • They must have integration tests with configuration (if the linter has a configuration).

.golangci.next.reference.yml

  • The file .golangci.next.reference.yml must be updated.
  • The file .golangci.reference.yml must NOT be edited.
  • The linter must be added to the lists of available linters (alphabetical case-insensitive order).
    • enable and disable options
  • If the linter has a configuration, the exhaustive configuration of the linter must be added (alphabetical case-insensitive order)
    • The values must be different from the default ones.
    • The default values must be defined in a comment.
    • The option must have a short description.

Others Requirements

  • The files (tests and linter) inside golangci-lint must have the same name as the linter.
  • The .golangci.yml of golangci-lint itself must not be edited and the linter must not be added to this file.
  • The linters must be sorted in the alphabetical order (case-insensitive) in the lintersdb/builder_linter.go and .golangci.next.reference.yml.
  • The load mode (WithLoadMode(...)):
    • if the linter uses goanalysis.LoadModeSyntax -> no WithLoadForGoAnalysis() in lintersdb/builder_linter.go
    • if the linter uses goanalysis.LoadModeTypesInfo, it requires WithLoadForGoAnalysis() in lintersdb/builder_linter.go
  • The version in WithSince(...) must be the next minor version (v2.X.0) of golangci-lint.
  • WithURL() must contain the URL of the repository.

Recommendations

  • The file jsonschema/golangci.next.jsonschema.json should be updated.
  • The file jsonschema/golangci.jsonschema.json must NOT be edited.
  • The linter repository should have a readme and linting.
  • The linter should be published as a binary (useful for diagnosing bug origins).
  • The linter repository should have a .gitignore (IDE files, binaries, OS files, etc. should not be committed)
  • A tag should never be recreated.
  • use main as the default branch name.

The golangci-lint team will edit this comment to check the boxes before and during the review.

The code review will start after the completion of those checkboxes (except for the specific items that the team will help to verify).

The reviews should be addressed as commits (no squash).

If the author of the PR is a member of the golangci-lint team, he should not edit this message.

This checklist does not imply that we will accept the linter.

@ldez ldez self-requested a review April 19, 2025 11:51
@ldez ldez added the linter: new Support new linter label Apr 19, 2025
@ldez ldez changed the title feat(golinters): add golicenser Add golicenser linter Apr 19, 2025
@ldez ldez added the feedback required Requires additional feedback label Apr 19, 2025
@ldez
Copy link
Member

ldez commented Apr 19, 2025

This linter is obviously a duplicate of goheader.

I would have preferred an enhancement of goheader, but I also understand it is a full rewrite from scratch.

Currently, goheader is limited and has some design problems.

I note a "missing" option compared to goheader: template-path.
Is it a choice?

@ldez
Copy link
Member

ldez commented Apr 19, 2025

Also, there is a major and problematic difference with goheader: the comment header style is not respected.
The license headers are always converted to // instead of using the initial comment style (ex /* */).

If a fix is applied, the following comments are converted to //:

/*
	Copyright (c) 2025 golangci-lint <test@example.com>.
	This file is a part of golangci-lint.
*/
/*
Copyright (c) 2025 golangci-lint <test@example.com>.
This file is a part of golangci-lint.
*/
/* Copyright (c) 2025 golangci-lint <test@example.com>.
This file is a part of golangci-lint. */

Another problem: the whitespaces inside the template are not respected if they are at the beginning of a sentence.

linters:
  settings:
    golicenser:
      header:
        template: |-
            Copyright (c) {{.year}} {{.author}} <{{.email}}>.
            This file is a part of {{.projectname}}.

(There are extra spaces before each sentence.)

@ldez ldez changed the title Add golicenser linter feat: add golicenser linter Apr 19, 2025
@ldez
Copy link
Member

ldez commented Apr 19, 2025

The linter has the same problem as goheader with the starred header:

/*
 * header text
 * header text
 */

The linter fixes remove build directives:

//go:build unix && !foo

package testdata

The linter only checks headers that match the template: this is an unexpected behavior.

@joshuasing
Copy link
Author

joshuasing commented Apr 19, 2025

This linter is obviously a duplicate of goheader.

I would have preferred an enhancement of goheader, but I also understand it is a full rewrite from scratch.

Currently, goheader is limited and has some design problems.

Agreed, sorry I meant to point that out in the PR but forgot. I had originally used goheader for many of my projects, and ended up writing golicenser as a replacement due to said limitations and design problems. Ideally, golicenser is meant to replace goheader in many ways.

I note a "missing" option compared to goheader: template-path. Is it a choice?

Hmm, this is implemented in the golicenser CLI instead of the analyzer itself, so I could implement it in golangci-lint in the same way to support this.


Also, there is a major and problematic difference with goheader: the comment header style is not respected. The license headers are always converted to // instead of using the initial comment style (ex /* */).

golicenser supports setting a "comment style", where either "line" or "block" can be used - the idea of this was to help keep all license headers as consistent as possible. Perhaps a "preserve" comment style, which behaves in this way, could be a better default?

Another problem: the whitespaces inside the template are not respected if they are at the beginning of a sentence.

That's interesting, I will look into this. I am sure I have tested this without the yaml config and it worked before, so I wonder if it's some parsing difference 🤔

The linter has the same problem as goheader with the starred header:

/*
 * header text
 * header text
 */

Ah, comment style "block" will render as:

/*
comment
*/

to match how Kubernetes' license headers are formatted. The "preserve" comment style suggested above could fix this as well, otherwise a separate comment style for rendering the star on each line would work 🤔

The linter fixes remove build directives:

//go:build unix && !foo

package testdata

That is definitely not intentional, will fix. Thank you.

The linter only checks headers that match the template: this is an unexpected behavior.

The linter should check headers against two regexps: the copyright header regexp, then the license header matcher:

  • If the comment does not match the copyright header regexp (by default: (?i)copyright), the file is treated as missing a license header and a new license header will be generated.
  • If it matches the copyright header regexp, but not the header matcher, it is assumed the file has a different license/copyright header and thus should be ignored.
  • If it matches the copyright header regexp and the header matcher, then it will be updated if the actual comment is different from the rendered template.

Is this the observed behaviour, and does this behaviour make sense?

@ldez
Copy link
Member

ldez commented Apr 19, 2025

I think that comment-style is better than preserving the style, but it should be extended to support starred headers.

The linter should check headers against two regexps: the copyright header regexp, then the license header matcher:

This is unintuitive, so too complex IMHO.

@ldez
Copy link
Member

ldez commented Apr 19, 2025

Additional context: I was discussing with the goheader author, before this PR, about a rewrite of goheader: denis-tingaikin/go-header#45 (comment)

And I notified the author in DM about this PR.

@joshuasing
Copy link
Author

joshuasing commented Apr 19, 2025

I think that comment-style is better than preserving the style, but it should be extended to support starred headers.

Okay, will do.

The linter should check headers against two regexps: the copyright header regexp, then the license header matcher:

This is unintuitive, so too complex IMHO.

Hm okay. The reason for this was primarily so that projects that contain files with different licenses could ignore such files easily, however these could also be excluded individually, I suppose. I will rewrite this.

Additional context: I was discussing with the goheader author, before this PR, about a rewrite of goheader: denis-tingaikin/go-header#45 (comment)

And I notified the author in DM about this PR.

Great. My main goal is to end up with a license header linter which is easy to use and works for any project, so if there's anything I can do to help achieve that, please let me know.

Thanks for all of the feedback so far!

@joshuasing
Copy link
Author

joshuasing commented Apr 19, 2025

Added template-path configuration option, which will load the file and pass its contents (with ending newline removed) to golicenser: 4f6bb90 (#5751)

I have reproduced the build directive issue locally: The ast.File comments include the directive comments, however golicenser does not appear to replace it unless the directive comment matches the copyright header matcher 🤔 I will figure out a way to make it ignore directives entirely.

@ldez
Copy link
Member

ldez commented Apr 19, 2025

You must remove this: https://github.com/joshuasing/golicenser/blob/e9a45b2af8f63428fd666fa8c6425f0f8b82ac8d/analysis.go#L51-L53

testdata is a special name; the Go tooling already excludes it.

The go tooling ignores directories starting with ., _, or named testdata.

The go tool will ignore a directory named "testdata", making it available to hold ancillary data needed by the tests.
https://pkg.go.dev/cmd/go#hdr-Test_packages

https://github.com/golang/go/blob/63ba2b9d84dede1df107db30b4ff8139711402eb/src/cmd/go/internal/modindex/read.go#L386-L388

https://github.com/golang/go/blob/63ba2b9d84dede1df107db30b4ff8139711402eb/src/go/build/build.go#L623-L625

@joshuasing
Copy link
Author

joshuasing commented Apr 20, 2025

You must remove this: joshuasing/golicenser@e9a45b2/analysis.go#L51-L53

testdata is a special name; the Go tooling already excludes it.

The go tooling ignores directories starting with ., _, or named testdata.

The go tool will ignore a directory named "testdata", making it available to hold ancillary data needed by the tests.
pkg.go.dev/cmd/go#hdr-Test_packages

golang/go@63ba2b9/src/cmd/go/internal/modindex/read.go#L386-L388

golang/go@63ba2b9/src/go/build/build.go#L623-L625

Thanks for pointing this out. I have removed it here: joshuasing/golicenser#10


For this note: a314c0e (#5751)

Setting MaxConcurrent to 1 currently results in a goroutine being created in each Run call, so I have changed the behaviour in golicenser to process files synchronously when MaxConcurrent <= 1 in joshuasing/golicenser#11.

MaxConcurrent also now defaults to 0 (meaning no concurrency), so it can be removed from the config if wanted.


I have created joshuasing/golicenser#12 to add a "starred-block" comment style, which renders as:

/*
 * header content
 */

The pull request also improves comment parsing to properly handle such comments, as well as changing the header detection to ignore directive comments and make sure they are never replaced.


I will release the above changes as v0.4.0 and update this pull request soon.

Note: I have also moved the position used in diagnostics to file.Package, which seems to work far nicer with analysistest as it no longer requires putting the "want" comment on the first line and trying to ignore it in the matcher. I believe this also makes the diagnostics more useful, however please let me know your thoughts on this change.

Any feedback is highly appreciated, thank you.

@ldez
Copy link
Member

ldez commented Apr 20, 2025

For me, the current configuration is unintuitive and too complex.

IMHO, the following options should be removed:

  • copyright-header-matcher
  • header
  • header.author
  • header.author-regexp
  • header.matcher
  • header.matcher-escape

Also, It could be great to not use the default template markers {{/}}.

I think your goal is to handle multiple headers (not currently implemented).
But it is unclear if your goal is to handle either multiple header in the same file or different header by file

I suggest the following configurations:

  1. simple case (no multiple headers)
golicenser:
  comment-style: "line"
  template: |-
    Copyright (c) [[.year]] [[.author]]. All rights reserved.
    Use of this source code is governed by a BSD-style
    license that can be found in the LICENSE file.
#  template-path: /path/to/my/template.tmpl
  year-mode: "git-range"
  variables:
    author:
      match: "(My Name|Your Name)"
      replacement: "My Name"
    email:
      match: "(.+)@mycompany\\.com"
      replacement: "foo@example.com"
    project-name:
      value: "My project"
  1. complex case (multiple headers)
golicenser3:
  default-header: bsd # if there is only one header definition, use it as default.
  headers: # map[string]Options
    bsd:
      matcher: 'BSD-style' # Matches on existing headers. (Empty -> all files)
      comment-style: "line"
      template: |-
        Copyright (c) [[.year]] [[.author]]. All rights reserved.
        Use of this source code is governed by a BSD-style
        license that can be found in the LICENSE file.
      #  template-path: /path/to/my/template.tmpl
      year-mode: "git-range"
      variables:
        author:
          match: "(My Name|Your Name)"
          replacement: "My Name"
        email:
          match: "(.+)@mycompany\\.com"
          replacement: "foo@example.com"
        project-name:
          value: "My project"

    apache:
      matcher: 'Apache License' # Matches on existing headers. (Empty -> all files)
      comment-style: "line"
      template: |-
        Copyright [[.year]] [[.author]]
  
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
    #  template-path: /path/to/my/template.tmpl
      year-mode: "git-range"
      variables:
        author:
          match: "(My Name|Your Name)"
          replacement: "My Name"
        email:
          match: "(.+)@mycompany\\.com"
          replacement: "foo@example.com"
        project-name:
          value: "My project"

OR

golicenser3:
  default-header: bsd # if there is only one header definition, use it as default.
  headers: # map[string]Options
    bsd:
      matcher: my/package/one # Match on the package path. (Empty -> match all except other rules)
      comment-style: "line"
      template: |-
        Copyright (c) [[.year]] [[.author]]. All rights reserved.
        Use of this source code is governed by a BSD-style
        license that can be found in the LICENSE file.
      #  template-path: /path/to/my/template.tmpl
      year-mode: "git-range"
      variables:
        author:
          match: "(My Name|Your Name)"
          replacement: "My Name"
        email:
          match: "(.+)@mycompany\\.com"
          replacement: "foo@example.com"
        project-name:
          value: "My project"

    apache:
      matcher: my/package/two # Match on the package path. (Empty -> match all except other rules)
      comment-style: "line"
      template: |-
        Copyright [[.year]] [[.author]]
  
        Licensed under the Apache License, Version 2.0 (the "License");
        you may not use this file except in compliance with the License.
    #  template-path: /path/to/my/template.tmpl
      year-mode: "git-range"
      variables:
        author:
          match: "(My Name|Your Name)"
          replacement: "My Name"
        email:
          match: "(.+)@mycompany\\.com"
          replacement: "foo@example.com"
        project-name:
          value: "My project"

@joshuasing
Copy link
Author

For me, the current configuration is unintuitive and too complex.

IMHO, the following options should be removed:

  • copyright-header-matcher
  • header
  • header.author
  • header.author-regexp
  • header.matcher
  • header.matcher-escape

Thanks for the feedback! I agree it is currently too complex, and many improvements can be made.

For these options:

  • copyright-header-matcher can be removed as this doesn't really need to be configurable, users are more likely to break things with this I think.
  • header can be collapsed into the primary options - Within golicenser, the header rendering portion is separated (and thus has HeaderOpts) to separate concerns, which could also be improved.
  • I think header.author and header.author-regexp can be removed and a variable can be used instead. These existed prior to me adding matcher/regexp support for variables.
  • header.matcher (and header.matcher-escape) exists to detect if the file's license header is the "project's license", and ignore it if not, assuming the file is licensed differently. I think this could be removed, users will just need to add excludes for each file/dir that is licensed differently.

I will rewrite the matching system and implement these changes during the week.

Also, It could be great to not use the default template markers {{/}}.

As in change the text/template delimiters to [[ and ]] or similar, or change from using text/template to a "custom" rendering system? text/template seems to take ~4-5ms per file, which could likely be beaten with a custom replacer-like function, however I like the level of control it provides.

I think your goal is to handle multiple headers (not currently implemented). But it is unclear if your goal is to handle either multiple header in the same file or different header by file

golicenser was primarily designed to lint the project's own license headers, since most projects only have one license. I could implement support for multiple headers, which I suppose allows projects to enforce different headers/licenses in different areas of the codebase. 🤔

How useful would this be and how often would this likely be used?

Thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feedback required Requires additional feedback linter: new Support new linter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants