Conda channels should also add the "constrains" from a package's index file to repodata.json

e.rotteveel_1850 · 10 Feb 2023, 15:18

I noticed that ProGet's conda channels do not add the constrains key from a package's index data to their index.

Conda packages have a very useful feature called "run_constrained". These are in effect optional dependencies: when a user installs one of them, conda will install the version and build specified in the package's constrains key.

Basically, ProGet should take the constrains key from a package's index.json file and add it to the package records in the repodata.json files, in a similar way to what is already done for the depends key.

I can supply an example package if needed.

atripp · 14 Feb 2023, 06:38

Hi @e-rotteveel_1850 ,

Thanks for the suggestion! We could definitely use your help in getting a few more details on how to implement this... we only learned conda by trying to implement a repository.

I couldn't find any info about constraints from searching their documentation.

Can you provide us with an example package we can upload to ProGet (public package is easiest, but private is fine too)? And also, show us what it should look like in the repodata.json file?

Cheers,
Alana

e.rotteveel_1850 · 14 Feb 2023, 16:32

I'm happy to help you improve the conda functionality! Given that the conda functionality is very new, it is totally OK that there is room for improvement.

I created a small zip file containing a conda channel that can be downloaded here:
https://nextcloud.marin.nl/index.php/s/adkyQ59XojQNEn8

And also a link to the example package:
https://nextcloud.marin.nl/index.php/s/mzmbHjEjszwGYKB

The zip file contains a fully indexed conda channel (with only one package, though). You can see the repodata.json in the noarch folder; it looks like this:

{
  "info": {
    "subdir": "noarch"
  },
  "packages": {
    "hello-inedo-1.0.0-1.tar.bz2": {
      "build": "1",
      "build_number": 1,
      "constrains": [
        "scipy=1.9.3"
      ],
      "depends": [
        "numpy 1.22.*"
      ],
      "license": "None",
      "md5": "b72976a9e6b36563ecc988b5cc033055",
      "name": "hello-inedo",
      "noarch": "python",
      "sha256": "4b3eee67a7dc4475d8a84e3c5fd5b48924c0e472b59e87167b67a8f673523707",
      "size": 5169,
      "subdir": "noarch",
      "timestamp": 1676390732144,
      "version": "1.0.0"
    }
  },
  "packages.conda": {},
  "removed": [],
  "repodata_version": 1
}

So alongside the depends key, there is the constrains key. Note that there is no second t because it's a verb, not a noun. The key defines how the package hello-inedo constrains the user when they install scipy later on. The info is pulled from the index.json file in a package's info directory (see noarch/hello-inedo-1.0.0-1.tar.bz2) in the same way as the depends key, which you are already using in ProGet. For reference, this is the index.json file in the hello-inedo package:

{
  "arch": null,
  "build": "1",
  "build_number": 1,
  "constrains": [
    "scipy=1.9.3"
  ],
  "depends": [
    "numpy 1.22.*"
  ],
  "license": "None",
  "name": "hello-inedo",
  "noarch": "python",
  "platform": null,
  "subdir": "noarch",
  "timestamp": 1676390732144,
  "version": "1.0.0"
}
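
As a rough illustration of where this file lives, the sketch below reads info/index.json out of a .tar.bz2 package using only Python's standard library. It is not ProGet's code; the helper names read_index_json and make_demo_package are made up for this example, and the demo package is built in memory rather than downloaded.

```python
import io
import json
import tarfile


def read_index_json(package_file):
    """Return the parsed info/index.json from a .tar.bz2 package file object."""
    with tarfile.open(fileobj=package_file, mode="r:bz2") as archive:
        member = archive.extractfile("info/index.json")
        return json.load(member)


def make_demo_package(index):
    """Build a tiny in-memory .tar.bz2 that only contains info/index.json (demo only)."""
    payload = json.dumps(index).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        info = tarfile.TarInfo("info/index.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


demo = make_demo_package({
    "name": "hello-inedo",
    "constrains": ["scipy=1.9.3"],
    "depends": ["numpy 1.22.*"],
})
index = read_index_json(demo)
# Older packages may have no constrains key at all, so default to an empty list.
constrains = index.get("constrains", [])
print(constrains)
```

Reading the key with a default keeps packages without a constrains entry working unchanged.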

Unfortunately, conda's documentation does not fully cover the more advanced topics. I had to figure out quite a few things the hard way.

But what I think you should do is:

1. Get the constrains key from the index.json that is in a package.
2. Add that info to the package's entry in the repodata.json file.
3. On the package page in the UI, show them under "Optional dependencies" if ProGet supports that.

After all, constrains is just a list of optional dependencies.
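
The first two steps could be sketched roughly as follows. This is a minimal sketch, not how ProGet or conda's own indexer does it: the function name build_package_record and the COPIED_KEYS list are assumptions, chosen so that the result has the same keys as the repodata.json example earlier in this post (arch and platform are left out, as in that example).

```python
import hashlib

# Keys copied from index.json into the repodata.json record (assumed list,
# based on the example record shown earlier in this thread).
COPIED_KEYS = (
    "build", "build_number", "constrains", "depends", "license",
    "name", "noarch", "subdir", "timestamp", "version",
)


def build_package_record(index, package_bytes):
    """Create a repodata.json-style record from index.json data and the package file."""
    # constrains is copied exactly like depends; it is skipped if the package has none.
    record = {key: index[key] for key in COPIED_KEYS if key in index}
    # These three values are computed from the package file itself.
    record["md5"] = hashlib.md5(package_bytes).hexdigest()
    record["sha256"] = hashlib.sha256(package_bytes).hexdigest()
    record["size"] = len(package_bytes)
    return record


index = {
    "arch": None,
    "build": "1",
    "build_number": 1,
    "constrains": ["scipy=1.9.3"],
    "depends": ["numpy 1.22.*"],
    "name": "hello-inedo",
    "platform": None,
    "subdir": "noarch",
    "version": "1.0.0",
}
record = build_package_record(index, b"placeholder package contents")
print(record["constrains"], record["depends"])
```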

I hope this helps!

atripp · 17 Feb 2023, 04:00

Hi @e-rotteveel_1850 ,

Thanks so much, this will help quite a lot and should be easy to follow! I downloaded those packages and attached them to our internal tracker.

So basically... it sounds like we should just treat constrains (no t) like we do depends? And if we can display it in the UI, then we will.

I peeked at the code, and it's a bit more complex than I hoped... mostly because of how we have to index/cache "connector" data as a SQLite database. But hopefully not that complex.

Anyway, I'll update once we have an idea of when we can get this field in.

Cheers,
Alana

e.rotteveel_1850 · 17 Feb 2023, 16:42

Thanks! I was checking database entries (in relation to my other topic) and noticed that you are probably generating the repodata.json on the fly based on the packages in the database/repository. So I realized it would be more complex than just adding a file.

But indeed, I think constrains should be treated the same way as depends.
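
To make the "generated on the fly" idea concrete, here is a purely hypothetical sketch using Python's built-in sqlite3 module. ProGet's real schema is not described anywhere in this thread, so the packages table, its columns, and the build_repodata function are all invented for illustration. The only point is that if the stored record carries constrains next to depends, the generated repodata.json carries it too.

```python
import json
import sqlite3


def build_repodata(conn, subdir):
    """Assemble a repodata.json-style dict from a hypothetical packages table."""
    rows = conn.execute(
        "SELECT filename, record_json FROM packages WHERE subdir = ? ORDER BY filename",
        (subdir,),
    )
    return {
        "info": {"subdir": subdir},
        "packages": {filename: json.loads(record) for filename, record in rows},
        "packages.conda": {},
        "removed": [],
        "repodata_version": 1,
    }


# Hypothetical index table: one row per package file, record stored as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (filename TEXT PRIMARY KEY, subdir TEXT, record_json TEXT)"
)
record = {
    "name": "hello-inedo",
    "version": "1.0.0",
    "depends": ["numpy 1.22.*"],
    "constrains": ["scipy=1.9.3"],
}
conn.execute(
    "INSERT INTO packages VALUES (?, ?, ?)",
    ("hello-inedo-1.0.0-1.tar.bz2", "noarch", json.dumps(record)),
)

repodata = build_repodata(conn, "noarch")
print(json.dumps(repodata, indent=2))
```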

I'll watch this thread, looking forward to the next update!

e.rotteveel_1850 · 27 Feb 2023, 08:08

I'm curious; any news on this topic?

atripp · 28 Mar 2023, 06:56

Hi @e-rotteveel_1850 ,

Sorry, but I had this mis-categorized internally, so I didn't see the reply.

This actually requires a fair amount of under-the-hood changes, because of the way we maintain an index of the conda packages in a SQLite database. It's not terrible, but it's also not complex.

This was added as a "nice to have" in PG2023 :)

Cheers,
Alana

e.rotteveel_1850 · 31 Mar 2023, 08:13

Thanks, looking forward to the next release!

With the 'constrains' option working, we can really use ProGet as our main method of distributing (Python) packages within our company. It currently works great for most packages, except for so-called meta-packages with optional dependencies.

Meta-packages are nothing more than a list of dependencies. The great thing about them is that they can very effectively "lock" environments, meaning that users cannot (either by accident or on purpose) install different versions of packages critical to our workflows. The anaconda package is also a meta-package. If you want to install a package with a different version than the one stated in the meta-package, you will need to uninstall that meta-package first.

But there are plenty of cases where you don't want users/developers to have to install all 400+ packages of interest each time they create a new conda environment. For that, the "constrains" option comes in handy: we still strictly pin the versions of certain packages, but they do not have to be installed directly. Only when a user needs such a package later on will they get the version we want them to have. This is great when someone starts a new project: initially they only need pandas and/or numpy, but later they want to try some machine learning and need scikit-learn or pytorch.
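
The difference between depends and constrains can be shown with a toy check. This is a deliberately simplified sketch and not conda's real solver or match-spec parser: parse_spec, version_matches and check_environment are made-up helpers that only understand specs of the form "name=version" or "name pattern", as used in the hello-inedo example.

```python
from fnmatch import fnmatchcase


def parse_spec(spec):
    """Split a simple spec such as 'scipy=1.9.3' or 'numpy 1.22.*' into (name, pattern)."""
    name, _, pattern = spec.replace("=", " ", 1).partition(" ")
    return name, pattern.strip() or "*"


def version_matches(version, pattern):
    """Toy matcher: glob match, or the pattern is a dotted prefix of the version."""
    return fnmatchcase(version, pattern) or version.startswith(pattern + ".")


def check_environment(record, installed):
    """Return a list of problems for `record`, given a {name: version} dict of installed packages."""
    problems = []
    for spec in record.get("depends", []):
        name, pattern = parse_spec(spec)
        # A dependency must always be present and must match.
        if name not in installed:
            problems.append(f"missing dependency: {name}")
        elif not version_matches(installed[name], pattern):
            problems.append(f"wrong version of {name}: {installed[name]}")
    for spec in record.get("constrains", []):
        name, pattern = parse_spec(spec)
        # A constraint only applies if the package is actually installed.
        if name in installed and not version_matches(installed[name], pattern):
            problems.append(f"constraint violated: {name} {installed[name]}")
    return problems


hello_inedo = {"depends": ["numpy 1.22.*"], "constrains": ["scipy=1.9.3"]}

# scipy is not installed, so the constraint is not triggered.
print(check_environment(hello_inedo, {"numpy": "1.22.4"}))
# scipy is installed later in a version the constraint does not allow.
print(check_environment(hello_inedo, {"numpy": "1.22.4", "scipy": "1.10.0"}))
```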

Good luck with the development of version 2023, I'll keep an eye on new releases.

atripp · 31 Mar 2023, 12:22

@e-rotteveel_1850 thanks for explaining that, that's great to know!

Sometimes it's almost impossible to learn how these feed/package types are actually used, especially since we don't develop in those languages and really just focus mostly on API reverse-engineering ;)

FYI, we are targeting late April for the 2023.0 release.