SPDX license expressions



  • Hi there!

    In our ongoing quest to identify the licenses of all packages used in our products (and, eventually, blocking all packages with unknown licenses), we came across license expressions that combine several licenses. The syntax for such combinations is documented here: https://spdx.github.io/spdx-spec/v2-draft/SPDX-license-expressions/

    Take for example the npm package "atob" (https://www.npmjs.com/package/atob). It is licensed under both the MIT and the Apache 2.0 licenses. Consequently, the corresponding license expression is (MIT OR Apache-2.0). This is not recognized by ProGet out of the box.

    Now, we could define a new license for every combination of two licenses that we come across. However, I was wondering whether it would make sense to support at least a simple subset of the SPDX license expression syntax, specifically the AND and OR keywords. License filtering could then be implemented as follows:

    X AND Y: Allow downloading package if both X and Y are allowed.
    X OR Y: Allow downloading if at least one of the licenses is allowed.

    X and Y could be simple SPDX license tags or another disjunctive or conjunctive license expression.
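
    A minimal sketch of these two rules in Python, assuming an allow list is simply a set of SPDX license tags. The function name is_allowed is made up for illustration and has nothing to do with ProGet's actual code. AND binds tighter than OR, as in the SPDX specification; WITH and + are not handled.

```python
import re

# Tokens: "(", ")", or any run of characters without whitespace/parentheses.
TOKEN = re.compile(r"\(|\)|[^\s()]+")

def is_allowed(expression, allowed):
    """Return True if the AND/OR expression is satisfied by the allowed tags."""
    tokens = TOKEN.findall(expression)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            right = parse_and()  # parse first so the position always advances
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            right = parse_atom()
            result = result and right
        return result

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()  # a nested expression
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {token!r}")
        return token in allowed  # a simple license tag

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

    With this sketch, is_allowed("(MIT OR Apache-2.0)", {"MIT"}) is True, while is_allowed("MIT AND Apache-2.0", {"MIT"}) is False.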

    Would such an approach make sense? It probably would not be trivial to implement, and I honestly don't know how many packages use such combined expressions (I haven't come across many of them yet), so I don't know if it would be worth the effort. It would be nice to hear how everyone else is dealing with such packages.


  • inedo-engineer

    I'm curious to know too if anyone else is interested in this.

    FYI: we were intentionally "lazy" when we came across these, because the expressions got pretty complex (in the specs) and it seemed more suitable for a human to determine what WITH and this and that meant.

    ProGet already supports multiple licenses per package (they are treated as an OR), but we thought it might be unintuitive to only support OR, so we just left it as is.



  • Hi @apxltd

    I didn't realize ProGet already supports multiple licenses. In that case it might make sense to support just the OR operator for now (and see if there is an actual need to support other operators or more complex expressions).

    In fact, one could argue that the AND operator actually creates a new license (the combination of the two or more licenses involved), and that it makes sense to treat the combination as a completely new license, while the OR operator offers you a chance to choose one of the licenses. The same goes for the WITH operator: I don't mind treating License Abc WITH exception Xyz as a completely new and different license. So it might make sense to treat OR differently from AND or WITH.
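
    This idea can be sketched in Python as follows: split an expression on OR wherever it offers a real choice, and keep every AND or WITH combination as one opaque license name. The function names are hypothetical and the input is assumed to have balanced parentheses.

```python
def split_alternatives(expression):
    """Split on OR; keep AND/WITH combinations as single opaque entries."""
    tokens = expression.replace("(", " ( ").replace(")", " ) ").split()
    return [" ".join(part).replace("( ", "(").replace(" )", ")")
            for part in _split(tokens)]

def _split(tokens):
    # Drop parentheses that wrap the whole token list, e.g. "(MIT OR Apache-2.0)".
    while len(tokens) > 1 and tokens[0] == "(" and _closes_at_end(tokens):
        tokens = tokens[1:-1]
    parts, current, depth = [], [], 0
    for token in tokens:
        if token == "OR" and depth == 0:
            parts.append(current)
            current = []
        else:
            depth += {"(": 1, ")": -1}.get(token, 0)
            current.append(token)
    parts.append(current)
    if len(parts) == 1:
        return parts  # no top-level OR: one (possibly combined) license
    # OR is associative, so nested alternatives can be flattened.
    return [leaf for part in parts for leaf in _split(part)]

def _closes_at_end(tokens):
    """True if the opening "(" at index 0 is matched by the final token."""
    depth = 0
    for index, token in enumerate(tokens):
        depth += {"(": 1, ")": -1}.get(token, 0)
        if depth == 0:
            return index == len(tokens) - 1
    return False
```

    For example, "MIT OR (Apache-2.0 AND BSD-3-Clause)" yields the two entries "MIT" and "Apache-2.0 AND BSD-3-Clause"; the second would then have to be configured as a license of its own.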

    The only operator that is similar to the OR operator would be the + operator. Wouldn't it be great if ProGet "knew" that LGPL-2.1 and LGPL-3.0 are newer versions of LGPL-2.0 and that LGPL-2.0+ means "LGPL-2.0 or later" (effectively LGPL-2.0 OR LGPL-2.1 OR LGPL-3.0)?
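
    A rough Python sketch of such an expansion, assuming a hand-maintained table of version orders per license family. The table LICENSE_FAMILIES and the function expand_or_later are invented for this example; as far as this thread shows, ProGet has no such table.

```python
# Hypothetical, hand-maintained version order per license family (an assumption).
LICENSE_FAMILIES = {
    "LGPL": ["LGPL-2.0", "LGPL-2.1", "LGPL-3.0"],
}

def expand_or_later(identifier):
    """Expand "X+" into X and every later version of the same family."""
    if not identifier.endswith("+"):
        return [identifier]
    base = identifier[:-1]
    for versions in LICENSE_FAMILIES.values():
        if base in versions:
            return versions[versions.index(base):]
    return [base]  # unknown family: fall back to the base license only
```

    Here expand_or_later("LGPL-2.0+") gives ["LGPL-2.0", "LGPL-2.1", "LGPL-3.0"], which could then be treated like any other OR list.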


  • inedo-engineer

    Hi @sebastian ,

    FYI: for now, we plan to add support for OR when reading an SPDX expression from a manifest; we'll add this to the "nice to haves" in PG2023!

    Cheers,
    Alana


  • inedo-engineer

    Hi @sebastian

    Just an update; this was committed to the PG2023 code base, and seems to work on a few packages I tried (but I can't find too many).

    Basically, we will just enumerate the code matches in this Regex: ^\(?(( OR )?(?<code>[0-z-_\.]*))+\)?$

    That will catch (a OR b) and a OR b, but if one were to be silly and add ((a) OR (b)) then it would revert back.

    Doesn't seem worth getting any more complex than that :)
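
    For readers without a .NET regex engine at hand, the behavior described above can be approximated in Python. This is a sketch of what the pattern accepts, not the ProGet code; the name simple_or_codes is made up, and it returns None in the cases where the expression is not recognized.

```python
import re

# Character class copied from the regex above. Note that the range 0-z also
# covers some ASCII punctuation such as ":" and "@", not just letters/digits.
CODE = re.compile(r"[0-z-_\.]*")

def simple_or_codes(expression):
    """Return the license codes of a flat "a OR b" list, or None."""
    # Strip at most one "(" at the start and one ")" at the end, like \(? ... \)?
    inner = expression[1:] if expression.startswith("(") else expression
    inner = inner[:-1] if inner.endswith(")") else inner
    parts = inner.split(" OR ")
    if not all(CODE.fullmatch(part) for part in parts):
        return None  # e.g. nested parentheses or other operators
    return [part for part in parts if part]  # drop empty matches
```

    Both "(MIT OR Apache-2.0)" and "MIT OR Apache-2.0" give ["MIT", "Apache-2.0"], while "((a) OR (b))" gives None.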



  • @atripp said in SPDX license expressions:

    Hi @sebastian

    Just an update; this was committed to the PG2023 code base, and seems to work on a few packages I tried (but I can't find too many).

    Basically, we will just enumerate the code matches in this Regex: ^\(?(( OR )?(?<code>[0-z-_\.]*))+\)?$

    That will catch (a OR b) and a OR b, but if one were to be silly and add ((a) OR (b)) then it would revert back.

    Doesn't seem worth getting any more complex than that :)

    Thanks for the update on this! I don't think we will see a lot of complex combinations in the wild that are pure OR disjunctions, but just in case someone does something stupid, I took the liberty of slightly modifying your Regex. How about this: ^(?>((\s+OR\s+)?(?<c>\()*(?<code>[0-z-_\.]+)(?<-c>\))*)+)(?(c)(?!))$

    The \s+ before and after the OR allows any amount of whitespace, and the (?<c>\(), (?<-c>\)) and (?(c)(?!)) parts use balancing groups to allow any number of matching brackets (i.e. the same number of opening and closing brackets). Also, I replaced (?<code>[0-z-_\.]*) with (?<code>[0-z-_\.]+) to make sure there are no empty code matches.

    I haven't been able to come up with a valid OR-only SPDX expression that could not be matched by that Regex, so I think this should pretty much cover it. The only thing one could add would be to allow leading and trailing whitespace.
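
    Balancing groups are a .NET feature that Python's re module does not have, so a Python version has to count brackets by hand. The following is a sketch of the same idea under that assumption, with a made-up function name. It is slightly stricter than the Regex in that it requires an OR between two codes, whereas the Regex, as far as I can tell, treats the separator as optional.

```python
import re

CODE = r"[0-z-_\.]+"  # same character class as in the Regex above
# First item: opening brackets, a code, closing brackets.
FIRST = re.compile(rf"(?P<open>\(*)(?P<code>{CODE})(?P<close>\)*)")
# Every further item must be preceded by OR with surrounding whitespace.
NEXT = re.compile(rf"\s+OR\s+(?P<open>\(*)(?P<code>{CODE})(?P<close>\)*)")

def or_codes(expression):
    """Return the codes of an OR-only expression with balanced brackets, or None."""
    codes, depth, pos = [], 0, 0
    pattern = FIRST
    while pos < len(expression):
        match = pattern.match(expression, pos)
        if match is None:
            return None
        depth += len(match["open"])   # like pushing onto group "c"
        codes.append(match["code"])
        depth -= len(match["close"])  # like popping from group "c"
        if depth < 0:
            return None               # more ")" than "(" so far
        pos = match.end()
        pattern = NEXT
    # Like (?(c)(?!)): fail if any "(" is still open at the end.
    return codes if codes and depth == 0 else None
```

    With this, or_codes("((a) OR (b))") returns ["a", "b"], while "(MIT OR Apache-2.0" and "MIT AND Apache-2.0" both return None.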


  • inedo-engineer

    Thanks @sebastian !! Your RegExFu is impressive 😂

    Definitely more robust - so I replaced it, and now we will match that "silly" test case of ((a) OR (b))



  • @atripp said in SPDX license expressions:

    Thanks @sebastian !! Your RegExFu is impressive 😂

    Thanks, but to be honest: it's mainly just trial and error 😂



  • Hi @atripp,

    I just tested the implementation of this with ProGet 2023.1, using the aforementioned atob npm package. The filtering works perfectly. The package uses "MIT OR Apache-2.0", and as long as at least one of those two licenses is configured as allowed, the package can be downloaded. Only when both licenses are configured as "blocked" is the package blocked as well. This works 100% as expected!

    When I check the general page of the atob package, "License Information" on the "Overview" tab displays both licenses and their corresponding blocking configurations correctly.

    However, when I go to a specific version, the version's "Overview" tab will always state "This package has a MIT license, and may be used because of configured license filtering policies", even if MIT is actually blocked and only Apache-2.0 is allowed. This only changes when both licenses are blocked (in which case the page states "Packages with the MIT license cannot be downloaded due to a global license rule").

    Looks like this is just optics. As I said, the blocking itself seems to work exactly as expected.
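
    To illustrate what I would expect the version page to do: pick the first license that is actually allowed for the message, instead of the first license listed. A small Python sketch, where the function and the message texts are invented for the example and are not ProGet's wording or code:

```python
def license_notice(licenses, allowed):
    """Build an overview message from the first *allowed* license, if any."""
    usable = [name for name in licenses if name in allowed]
    if usable:
        return f"May be used under the {usable[0]} license."
    return "Blocked: none of " + ", ".join(licenses) + " is allowed."
```

    With MIT blocked and Apache-2.0 allowed, license_notice(["MIT", "Apache-2.0"], {"Apache-2.0"}) names Apache-2.0 rather than MIT.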

