Conda feed: channeldata.json with non-ASCII (or non-ANSI) characters causes problems with Conda
-
In the channeldata.json file (which can be found at http://PROGET.BASE.URL/conda/channel/channeldata.json), packages whose metadata contains non-ANSI or non-ASCII characters lead to parsing problems in conda-build.
I created a package that reproduces the problem and shows the differences with Conda channels produced via conda build. The example package can be downloaded here: https://nextcloud.marin.nl/index.php/s/QCkmijoD5oiJYx9
If I create a new Conda channel (using the latest ProGet version) and then upload this package, the channeldata.json looks like this:
{ "channeldata_version": 1, "packages": { "hello-inedo-triple": { "activate.d": false, "binary_prefix": false, "deactivate.d": false, "description": "Hello this is an “example” description", "dev_url": null, "doc_source_url": null, "doc_url": null, "home": "https://forums.inedo.com/topic/3696/conda-channels-should-also-add-the-constrains-from-a-package-s-index-file-to-repodata-json", "icon_hash": null, "icon_url": null, "identifiers": null, "keywords": null, "license": "None", "post_link": false, "pre_link": false, "pre_unlink": false, "recipe_origin": null, "run_exports": {}, "source_git_url": null, "source_url": null, "subdirs": [ "noarch" ], "summary": "Example package for Inedo/ProGet developers to improve their conda channel functionality", "tags": null, "text_prefix": false, "timestamp": 1677572777750, "version": "1.0.0" } }, "subdirs": [ "noarch" ] }
But if I create a channel with conda-build using this example package only, the JSON file looks like this:
{ "channeldata_version": 1, "packages": { "hello-inedo-triple": { "activate.d": false, "binary_prefix": false, "deactivate.d": false, "description": "Hello this is an \u201cexample\u201d description", "dev_url": null, "doc_source_url": null, "doc_url": null, "home": "https://forums.inedo.com/topic/3696/conda-channels-should-also-add-the-constrains-from-a-package-s-index-file-to-repodata-json", "icon_hash": null, "icon_url": null, "identifiers": null, "keywords": null, "license": "None", "post_link": false, "pre_link": false, "pre_unlink": false, "recipe_origin": null, "run_exports": {}, "source_git_url": null, "source_url": null, "subdirs": [ "noarch" ], "summary": "Example package for Inedo/ProGet developers to improve their conda channel functionality", "tags": null, "text_prefix": false, "timestamp": 1677572777, "version": "1.0.0" } }, "subdirs": [ "noarch" ] }
You can see that in the channel created by conda-build, the "special" quotes in the "description" key are encoded characters, whereas the ProGet feed just puts the characters themselves in the JSON output.
Both are valid JSON, but the way Conda handles the file means that some special characters should be encoded before they are written to the channeldata.json file.
Actually, I think it is ProGet that "does a bit too much" here. If you open the example package and look at info/about.json (which I guess is the file the description data is pulled from), you see that the special quotes are also encoded in the package itself. ProGet, however, decodes them before generating channeldata.json, leading to parsing problems in Conda itself.
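The difference between the two outputs can be reproduced with Python's standard json module. This is only a sketch: the about_json string is a hypothetical stand-in for the package's info/about.json, and it is an assumption that conda-build relies on json.dumps with its default ensure_ascii=True behaviour.

```python
import json

# Hypothetical stand-in for info/about.json: the quotes are stored as \u escapes.
about_json = r'{"description": "Hello this is an \u201cexample\u201d description"}'

# Parsing decodes the escapes into the real characters.
about = json.loads(about_json)

# ensure_ascii=True (the default) re-escapes every non-ASCII character.
escaped = json.dumps(about, ensure_ascii=True)

# ensure_ascii=False writes the characters themselves, as in the ProGet output.
literal = json.dumps(about, ensure_ascii=False)

print(escaped)
print(literal)

# Both strings are valid JSON and parse to the same data.
assert json.loads(escaped) == json.loads(literal)
```

Only the serialized text differs; a consumer that parses either form correctly ends up with identical data.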
-
I went through Conda's code to see how it handles the channeldata.json file. Basically, you can use this to reproduce the problem:

    import requests
    import json

    # Download channeldata.json from the ProGet feed and save the raw bytes.
    r = requests.get('http://PROGET.BASE.URL/conda/conda_channel/channeldata.json')
    with open('some_file', 'wb') as f:
        f.write(r.content)

    # Re-open in text mode; no encoding is given, so Python uses the platform default.
    with open('some_file') as f2:
        A = f2.read()
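A likely reason the text-mode read fails: without an explicit encoding, Python decodes the file with the platform's locale encoding, which on many Windows systems is cp1252 rather than UTF-8. The sketch below simulates this in memory. The choice of cp1252 is an assumption for illustration, and the payloads are shortened stand-ins for the two channeldata.json variants.

```python
import json

data = {"description": "Hello this is an \u201cexample\u201d description"}

# Bytes with literal characters (UTF-8) versus \u-escaped (pure ASCII).
literal_bytes = json.dumps(data, ensure_ascii=False).encode("utf-8")
escaped_bytes = json.dumps(data, ensure_ascii=True).encode("utf-8")


def try_decode(raw, codec):
    """Return the decoded text, or None if the bytes are invalid for codec."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


# The escaped variant is plain ASCII, so it decodes and parses under cp1252.
escaped_result = try_decode(escaped_bytes, "cp1252")
assert escaped_result is not None
assert json.loads(escaped_result) == data

# The literal variant contains byte 0x9D (from U+201D), which cp1252 leaves undefined.
literal_cp1252 = try_decode(literal_bytes, "cp1252")

# Decoding the same bytes as UTF-8 works.
literal_utf8 = try_decode(literal_bytes, "utf-8")
```

Passing encoding="utf-8" to open() would sidestep the problem on the reading side, but that is a change in the client rather than in the feed.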
Furthermore, look at Anaconda's main repository and the channeldata.json in there:
https://repo.anaconda.com/pkgs/main/channeldata.json
You will also find quite a few "encoded" characters (search for "u201c", for instance).
I think conda/conda-build normally makes sure the channeldata.json file is as plain as possible. That should at least apply to the description and summary keys, but maybe to the whole file in general. It probably also applies to the repodata.json file in the sub-directories such as noarch, win-64 and linux-64.
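To check which keys in a given channeldata.json or repodata.json are affected, something like the following could be used. It is a rough sketch: non_ascii_strings is a hypothetical helper, and the sample string is a shortened stand-in for a real file.

```python
import json


def non_ascii_strings(node, path=""):
    """Yield (path, value) for every string value containing non-ASCII characters."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from non_ascii_strings(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from non_ascii_strings(value, f"{path}/{index}")
    elif isinstance(node, str) and not node.isascii():
        yield path, node


# Shortened stand-in for a channeldata.json that contains literal curly quotes.
sample = (
    '{"packages": {"hello-inedo-triple": '
    '{"description": "an \u201cexample\u201d", "summary": "plain"}}}'
)

# If the raw text is pure ASCII, every special character is already escaped.
print("raw text is ASCII:", sample.isascii())

# List the values that a serializer would need to escape.
hits = list(non_ascii_strings(json.loads(sample)))
for path, value in hits:
    print(path, "->", value)
```

The raw-text check shows whether a file is already "plain"; the walk over the parsed data shows which keys would need escaping.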
-
Hi @e-rotteveel_1850 ,
Thanks for all the details here! Very helpful, especially since we know very little about Conda.
In our code, we have a WriteChannelData and a WriteRepoDataAsync method, which write out these files on demand using the Newtonsoft.Json library.
So, I just specified a StringEscapeHandling of EscapeNonAscii, which will escape all properties. I don't think that will be a problem.
The change is PG-2295, and it will ship in the next maintenance release (Friday, Mar 10). If you'd like to try it in a prerelease, just let me know and I can promote our CI build so you can use it sooner.
Cheers,
Steve
-
Thanks for the quick reply! I'll wait for the new release and give it a try then. No hurry on this; I just noticed it while uploading a package to the feed. For now, I can simply avoid such characters in my package description/summary.
-
I have just tested some uploads with 2022.24 and it seems to work fine now, thanks!