Conda feed: channeldata.json with non-ASCII (or non-ANSI) characters causes problems with Conda



  • In the channeldata.json file (which can be found at http://PROGET.BASE.URL/conda/channel/channeldata.json), packages whose metadata contains non-ANSI or non-ASCII characters lead to parsing problems in conda-build.

    I created a package that reproduces the problem and shows the differences with Conda channels produced via conda-build. The example package can be downloaded here: https://nextcloud.marin.nl/index.php/s/QCkmijoD5oiJYx9

    If I create a new conda channel (using the latest ProGet version) and then upload this package, the channeldata.json looks like this:

    {
      "channeldata_version": 1,
      "packages": {
        "hello-inedo-triple": {
          "activate.d": false,
          "binary_prefix": false,
          "deactivate.d": false,
          "description": "Hello this is an “example” description",
          "dev_url": null,
          "doc_source_url": null,
          "doc_url": null,
          "home": "https://forums.inedo.com/topic/3696/conda-channels-should-also-add-the-constrains-from-a-package-s-index-file-to-repodata-json",
          "icon_hash": null,
          "icon_url": null,
          "identifiers": null,
          "keywords": null,
          "license": "None",
          "post_link": false,
          "pre_link": false,
          "pre_unlink": false,
          "recipe_origin": null,
          "run_exports": {},
          "source_git_url": null,
          "source_url": null,
          "subdirs": [
            "noarch"
          ],
          "summary": "Example package for Inedo/ProGet developers to improve their conda channel functionality",
          "tags": null,
          "text_prefix": false,
          "timestamp": 1677572777750,
          "version": "1.0.0"
        }
      },
      "subdirs": [
        "noarch"
      ]
    }
    

    But if I create a channel with conda-build using this example package only, the JSON file looks like this:

    {
      "channeldata_version": 1,
      "packages": {
        "hello-inedo-triple": {
          "activate.d": false,
          "binary_prefix": false,
          "deactivate.d": false,
          "description": "Hello this is an \u201cexample\u201d description",
          "dev_url": null,
          "doc_source_url": null,
          "doc_url": null,
          "home": "https://forums.inedo.com/topic/3696/conda-channels-should-also-add-the-constrains-from-a-package-s-index-file-to-repodata-json",
          "icon_hash": null,
          "icon_url": null,
          "identifiers": null,
          "keywords": null,
          "license": "None",
          "post_link": false,
          "pre_link": false,
          "pre_unlink": false,
          "recipe_origin": null,
          "run_exports": {},
          "source_git_url": null,
          "source_url": null,
          "subdirs": [
            "noarch"
          ],
          "summary": "Example package for Inedo/ProGet developers to improve their conda channel functionality",
          "tags": null,
          "text_prefix": false,
          "timestamp": 1677572777,
          "version": "1.0.0"
        }
      },
      "subdirs": [
        "noarch"
      ]
    }
    

    You can see that in the channel created by conda-build, the "special" quotes in the "description" key are encoded characters, whereas the ProGet feed just puts the characters themselves in the JSON output.

    Both are fine and proper JSON, but the way conda handles things means that some special characters should be encoded before putting them in the channeldata.json file.

    Actually, I think it is ProGet that "does a bit too much" here. If you open the example package and look at info/about.json - which is the file where the description data is pulled from I guess - you see that in the package, the special quotes are also encoded. ProGet however decodes them before generating channeldata.json, leading to parsing problems in Conda itself.
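The difference between the two outputs can be reproduced with Python's standard json module. This is only a minimal sketch: it shows that both spellings are equivalent JSON and that only the escaped one is pure ASCII. Whether conda-build actually relies on the `ensure_ascii` default is an assumption here, not something confirmed in this thread.

```python
import json

# Description string taken from the example package in this thread.
description = "Hello this is an \u201cexample\u201d description"

# ensure_ascii=True is the json module's default: every non-ASCII
# character is written as a \uXXXX escape sequence.
escaped = json.dumps({"description": description}, ensure_ascii=True)

# ensure_ascii=False writes the characters themselves.
literal = json.dumps({"description": description}, ensure_ascii=False)

print(escaped)
print(literal)

# Both documents parse back to exactly the same value...
same_value = json.loads(escaped) == json.loads(literal)

# ...but only the escaped document consists of ASCII characters only.
escaped_is_ascii = escaped.isascii()
literal_is_ascii = literal.isascii()
```

The escaped form is safe to read with any ASCII-compatible encoding, which is why it is the more robust choice for a file consumed by many different clients.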



  • I went through Conda's code to see how it handles the channeldata.json file. Basically, you can use this to reproduce the problem:

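    import requests
    import json
    
    # Download the channel index from the ProGet feed
    r = requests.get('http://PROGET.BASE.URL/conda/conda_channel/channeldata.json')
    
    # Save the raw response bytes unchanged
    with open('some_file', 'wb') as f:
        f.write(r.content)
    
    # Re-open in text mode without an explicit encoding, so Python falls back
    # to the platform default, which may not be UTF-8 (e.g. on Windows)
    with open('some_file') as f2:
        A = f2.read()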
    

    Furthermore, looking at Anaconda's main repository and the channeldata.json in there:
    https://repo.anaconda.com/pkgs/main/channeldata.json
    You will also find quite a few "encoded" characters there (search for "u201c", for instance).

    I think conda/conda-build normally make sure the channeldata.json file is as plain (ASCII-only) as possible. That should at least apply to the description and summary keys, but maybe to the whole file in general. It probably also applies to the repodata.json file in the sub-directories such as noarch, win-64 and linux-64.
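The failure mode can also be shown without a ProGet server. The sketch below is an illustration under assumptions: `try_decode` is a hypothetical helper, and cp1252 stands in for a non-UTF-8 platform default encoding such as the one commonly found on Windows. It decodes the same JSON payload in both spellings.

```python
import json

payload = {"description": "Hello this is an \u201cexample\u201d description"}

# The same document as raw bytes: literal characters (UTF-8 encoded)
# versus \uXXXX escapes (ASCII only).
literal_bytes = json.dumps(payload, ensure_ascii=False).encode("utf-8")
escaped_bytes = json.dumps(payload, ensure_ascii=True).encode("utf-8")


def try_decode(raw, encoding):
    """Hypothetical helper: return the parsed JSON, or the decode error
    if the bytes are not valid in the given encoding."""
    try:
        return json.loads(raw.decode(encoding))
    except UnicodeDecodeError as exc:
        return exc


# cp1252 is an assumed stand-in for a non-UTF-8 default encoding.
literal_result = try_decode(literal_bytes, "cp1252")
escaped_result = try_decode(escaped_bytes, "cp1252")

print(type(literal_result).__name__)
print(escaped_result == payload)
```

The closing quote U+201D is encoded in UTF-8 with a byte (0x9D) that cp1252 does not define, so the literal form cannot be decoded, while the escaped form round-trips unchanged.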


  • inedo-engineer

    Hi @e-rotteveel_1850 ,

    Thanks for all the details here! Very helpful, especially since we know very little about Conda.

    In our code, we have a WriteChannelData and a WriteRepoDataAsync method, which write out these files on demand using the Newtonsoft.Json library.

    So, I just specified a StringEscapeHandling of EscapeNonAscii, which will escape all properties. I don't think that will be a problem.

    The change is PG-2295, and it will ship in the next maintenance release (Friday, Mar 10). If you'd like to try it in a prerelease, just let me know and I can promote our CI build so you can use it sooner.

    Cheers,
    Steve



  • Thanks for the quick reply! I'll wait for the new release and give it a try then. No hurry for this, I just noticed it when uploading a package to the feed. For now, I can simply avoid using such characters in my package description/summary.



  • I have just tested some uploads with 2022.24 and it seems to work fine now, thanks!
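One way to check a downloaded channeldata.json for this issue is to look for bytes outside the ASCII range. The sketch below is only illustrative: `non_ascii_offsets` is a hypothetical helper, and the two byte strings are shortened samples modelled on the outputs shown earlier in this thread, not real server responses.

```python
def non_ascii_offsets(raw, limit=10):
    """Hypothetical helper: return the offsets of the first `limit`
    bytes in `raw` that are above 0x7F (i.e. not ASCII)."""
    offsets = []
    for index, byte in enumerate(raw):
        if byte > 0x7F:
            offsets.append(index)
            if len(offsets) >= limit:
                break
    return offsets


# Shortened sample with literal quote characters (the problematic form).
before = '{"description": "an \u201cexample\u201d"}'.encode("utf-8")

# Shortened sample with escaped quote characters (the expected form).
after = b'{"description": "an \\u201cexample\\u201d"}'

print(non_ascii_offsets(before))
print(non_ascii_offsets(after))
```

An empty list means the file is ASCII-only; in practice the same check could be applied to the bytes of the downloaded file.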

