Elitism, and the frustrating necessity of PYPI

January 24, 2011 at 12:08 PM | Tags: packaging, python, pypi, elitism

This post was imported from an earlier version of this blog. Original here.

I recently started using a really clever, helpful module I found on PYPI (all names redacted to protect the guilty). I found a small bug in it, so I emailed the author. He wrote back and told me the bug had been fixed, and new versions were available on Launchpad. I assumed this meant in the development tip that hadn't been released yet. Instead, I was surprised to find that what was on PYPI was three major releases out of date. So I wrote back to him to ask him to push a new version - I even included the exact setup.py commands to run. This was the reply I got:

Yes I know, I should do this, but I hate such complex and silly technologues as easy_install and eggs and everything that transforms Python into a Java-like ugly piece of "programming-tool-for-the-dummy-masses" ;-)

I'm just stunned. Programmers certainly have a reputation for arrogance, but to see it so clearly on display from the author of a GPL'd module is just shocking. Why do we open source our code, let alone GPL it, if not to have as many people use it as possible? Try as I might, I cannot get my head around this way of thinking.

I'm no fan of the current state of Python packaging, as I've written before. I don't like setuptools much either. Heck, I once gave a talk titled Using Setuptools: Your Head vs. The Wall. PYPI is better, but not much. Text search is barely functional (substring? really?), the trove classifiers are useless, it's cluttered with abandonware, and the lack of signed packages is a security disaster waiting to happen. Yes, I'm whining, and no, I'm not going to do anything about it - I can only fight so many battles at once. But for better or for worse, it's what we've got.

Therefore, in the spirit of Docs or it Doesn't Exist:

If it's not on PYPI, your package doesn't exist.

With the advent of automatic dependency installation, no one is going to hunt around the web looking for the latest version of your package. Between various projects, I have tens or hundreds of third-party modules in use - tracking all of those down by hand like the bad old days simply isn't feasible. PYPI and easy_install have made me vastly more productive - allowing me to create new projects quickly and automate deployments. Hardly the kind of thing done by the "dummy-masses" - and it all works only because developers post their packages on PYPI. Yes, the tools suck sometimes - but your package is not special enough to make me give them up.

Since a reminder is apparently needed, here's how you get your code up. This works with straight distutils (no setuptools needed) and requires a PYPI account:

python setup.py sdist
python setup.py register
python setup.py upload
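
For anyone who has never written one, here is roughly what the setup.py behind those three commands could look like. Treat it as a minimal sketch, not a template blessed by anyone: the project name, version, URL and module name are placeholders, and missing_fields is a small helper invented for this example (it is not part of distutils) that refuses to package when the basic metadata is blank. The fallback import of setuptools is also an assumption, there only for Python versions that no longer bundle distutils.

```python
# setup.py: a minimal sketch. Every metadata value here is a placeholder.
import sys

METADATA = {
    "name": "example-package",          # hypothetical project name
    "version": "0.1.0",
    "description": "One-line summary of what the package does",
    "author": "Your Name",
    "author_email": "you@example.com",
    "url": "http://example.com/example-package",
    "license": "GPL",
    "py_modules": ["example_module"],   # hypothetical single-module project
}


def missing_fields(metadata, required=("name", "version", "description", "url")):
    """Return the required fields that are absent or empty (illustrative helper)."""
    return [field for field in required if not metadata.get(field)]


def main():
    problems = missing_fields(METADATA)
    if problems:
        sys.exit("refusing to package, missing: " + ", ".join(problems))
    try:
        from distutils.core import setup   # plain distutils, as described above
    except ImportError:
        from setuptools import setup       # assumption: fallback where distutils is not bundled
    setup(**METADATA)


# Only hand off to the packaging machinery when a command such as "sdist" was given.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

With a file like this next to your module, the commands above operate on it directly. The metadata check is optional, but it is a cheap way to avoid publishing an entry that says nothing about itself.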

PYPI uploads: for $DEITY's sake, just do it.


These comments were imported from an earlier version of this blog.

brandon_rhodes 2011/01/24 18:45:44 -0800

In the late 1990s I was a volunteer maintainer for RPMs in the public "contrib" repository, and made something like a dozen pieces of software available whose authors would never have had time to sit down and learn the mysteries of RPM development. (Including, I now see in an old log, Python 1.5 itself!) While PyPI distributions are much easier to create, they are nevertheless both a hurdle and quite an investment for many developers, and I myself have stepped in before to help the NLTK project get on PyPI when they lacked the cycles and expertise to do so themselves. Maybe we need to take the same approach today: to have volunteer maintainers step forward, subscribe to the github and bitbucket repositories of software they like using, and build PyPI distributions of the software when it is released. I will bet there would even be an economy of scale: that it would be more efficient for a Python packaging expert with a few free hours, and some good scripts for enacting the checkout / test / package / virtualenv-install / test cycle, to package a dozen software components than for each of those authors to wade into the Python packaging ecosystem on their own.

None 2011/01/24 19:35:42 -0800

There is also mkrelease (http://pypi.python.org/pypi/jarn.mkrelease/3.0.9) and other packages like it (but I prefer this one) that make it almost impossible to not release your packages to PyPI. No excuses! :-)

david 2011/01/25 01:59:22 -0800

Registering on pypi has a cost, which can be pretty high: you will receive bug reports because your software does not work with the so-called easy_install (whose real name should be magic-sometimes-works-install).

I have not registered several of my packages for exactly this reason (and they are still used).

Antonio Cavallo 2011/01/25 04:17:41 -0800

The biggest point against setuptools, easy_install and all the other "magic" dependency trackers is that they fail entirely to take into account what already exists.

Rpm (and the apt/yum/zypper dependency trackers) comes after nearly a decade of (painful) testing: tapping into these infrastructures is mandatory. As a sysadmin I'm not going to trust easy_install to do this for me. I have a duty to protect the company I work for and to be sure I can reproduce the steps I've taken to develop anything.

As for setup.py, it should not grow into an installer/deployer/configuration/meat-cooker script: it should focus on a single simple task, building the package. That is an important lesson learnt in the past: for example, the newer rpms no longer contain "installer" scripts for that exact reason: they are installers, not configuration tools (that's another scope).

I'm currently implementing a reasonable infrastructure to deal with that: it can build modules automatically from source code against a Python interpreter: http://pyvm.sf.net

Regards

Robert-Reinder Nederhoed 2011/01/25 04:37:53 -0800

I second your position! I'm a fanatic Python user, but not very handy with install procedures and Linux environments. I'm very glad to be able to install Django (for example) with:

python bootstrap.py
bin/buildout install
Voila!

For those who seem to have "deployment genes" it is hard to understand others might not be as capable as them.

Rene Dudfield 2011/01/25 06:44:36 -0800

I definitely agree that putting packages on pypi is useful. But I understand why people don't do it.

easy_install does cause a maintenance burden for some authors. So I understand this author's position. For one project I do not upload to pypi since there are a bunch of issues for that project and easy_install.

The same reasoning could be applied to ubuntu packages, windows+mac installers, ... freebsd ports, macports, android ports, symbian ports... the list goes on. How dare the author not package their code up for platform X. ;)

I asked a room full of python developers "who puts their packages on pypi", and two or three put their hands up. "who puts them on github, googlecode, or on their own webpage?" Most people raised their hands. This is why I think there are 10x more packages for python than what is listed on pypi. It also says something about how difficult packaging for python is.

loveencounterflow 2011/01/25 08:04:09 -0800

yeah, pypi sucks pretty badly. it really starts with the url: that could be

http://pypi.python.org

or (much better)

http://python.org/pypi

but it is http://pypi.python.org/pypi

which looks stoopid. you get shown a page with the most recent updates listed. if you want to see all packages on one page, you are offered a link that says http://pypi.python.org/pypi?:action=index, and when you click it, the browser address line will read

http://pypi.python.org/pypi?%3Aaction=index.

quaint, reminds me of the mid-nineties. btw you get the same listing if you manually go to http://pypi.python.org/pypi/, with a trailing slash. where are web programmers when you need them? obviously not in the search engine department. my way to discover packages is either using google with "site:pypi.python.org", or just doing a browser on-page search.

recently, a strange spam fad has cropped up on pypi: if you are receiving the pypi twitter feed, you will have noticed that for a while now there were packages claiming to be 'simple printer of nested lists', sometimes with several daily updates; today i see about 50 entries listed but the complete list must be much longer. it almost looks like a programming class is using pypi as a training area. i tried to downvote them and add MHOs to the comments, but got disappointed that i have to re-login to the site each day i visit it, and that the login (implemented using the defective bare-bones HTTP Auth) always leads me away from the package page i wanted to leave my rating on. if nobody takes action, python will go down in history as "that language that could print out lists".

it's all minor quibbles, but it does add up.

another problem i see is that a whopping 194 packages have absolutely nothing to say about themselves; there are entries like "vop 1.2 UNKNOWN" and "m2h 1.0 UNKNOWN". if the authors do not know, how am i to guess what "vop" can do for me? at least in this case there is a link to the package homepage.

Eric Larson 2011/01/25 09:04:22 -0800

I think that stating the author is arrogant is a little out of line. Any production system using pypi as a package repository should really consider the implications. You might automatically upgrade to a version that won't work. If pypi is down during a deployment, what happens? What happens when an author doesn't upload new stable releases or, worse yet, uploads unstable releases?

There is a real shift between development and production that can be difficult to see when using Python thanks to its dynamic nature. If you are using someone else's library in your application in production, then I'd say it is your responsibility to have a workflow for monitoring progress on that library, automating pulling in changes as you need them and packaging them up for your own deployment.

It would be nice to have a pypi library that helps to support creating this workflow that also plays nice with setuptools/distutils. But either way, a lot of the problems you see with pypi are arguably problems that you should solve yourself for your own production requirements.

toutpt 2011/01/31 07:07:06 -0800

This guy can't claim to be a developer: if he is not able to build a package, he can't build great software. Pypi is the place to host your releases.

regebro 2011/02/01 02:28:23 -0800

The annoying thing with packaging discussions is all the unclear criticism against packaging tools, often along the lines of "It should be better". Better how? "It should do this!" Well, it does. "No it doesn't!" Yeah, it DOES. "It ignores all experience from X" No it doesn't. "You can do Y and that's a bad thing!" So don't do it then.

Example: "You have to support easy_install if you upload on PyPI". No you don't, and supporting easy_install/pip is trivial in most cases, and you should. But you don't *have* to.

Or even better: "[The url] could be

http://pypi.python.org

or (much better)

http://python.org/pypi"

Well, funnily enough, those both work...

I can't, in the rants above, see one single constructive valid criticism of PyPI. Not one. I don't know why packaging issues make people into raging irrational ranters, but that's what happens, apparently.

anonymous 2011/02/03 12:25:22 -0800

If I decide to open source my software, I don't really care if you use it or not. I built it for me and decided to share it with others. If they can use it, great! If not, who cares? It's purely my choice to do so. It's purely my choice how to distribute it, just as it is my choice how to license it and how to document it.

Get your head around the fact that the original author owes you nothing and if he doesn't want to take the time to do things your way, you have to live with it. Choose another project... write your own code... but don't bitch at somebody because they shared their work and decided not to go as far as you wanted them to.

Being too busy or uninterested in packaging things a certain way does not reflect arrogance. But criticizing the limits of someone's generosity without considering their time and effort is pretty selfish.

anonymous 2011/02/03 12:44:27 -0800

And BTW, I do agree with the "Docs or it doesn't exist" blog you linked. I really wish projects were better documented. It's obvious any project will benefit from good docs.

Distributing python packages is a different situation. It's NOT obvious which distribution method is best. There are too many ways to do it and not everybody is on the same page. You may think PyPi is the standard, but I don't think it is. Until everyone agrees on a standard solution, we're all just going to have to deal with the fact that some people love eggs and some people hate them, etc.

The real problem with your anonymous open source author is his PyPi project is out of date. He should remove it or add a note telling people that it's out of date along with a pointer to current versions.

anonymous 2011/02/03 12:45:06 -0800

Wow. You're unbelievable. Can't take criticism?

anonymous 2011/02/03 12:48:41 -0800

Heh.. my mistake.. It looked like you removed my first comment within minutes of posting it. Sorry about that.