Schema Migrations for Django (kickstarter.com)
299 points by pykello on March 22, 2013 | 99 comments



Just threw $100 your way.

I'm a big believer in this type of model and having you be a Django core contributor and South developer makes it a no brainer. Paying for quality open source is always a win in my book.

Godspeed!


I am, on the other hand, not a fan of this investment model. Indeed I would go so far as to say I don't think it works nor does it make much sense. Especially in the open source world.

I've been using django since 2006. This has been a suggestion since then. Why hasn't it been implemented if it is so necessary? It doesn't require a massive core shift, it doesn't even require that much work it seems? So why hasn't it been done with so many eyes on so well documented a platform on so accessible a language with so many commercial deployments? If no one has scratched this itch then paying for someone to scratch an itch you may think you are go

Will the person be deprived of their normal source of income for exactly the time period required to do the work? How is that justified as a cost? Maybe they are a contractor who will take the time out. If that is the case, how can there be expandable requirements (longer goal objectives for more money)? It could be this whole thing sits very well with their day to day job, but in the western world that is unlikely. I'm sure all intentions are good, but it also strikes me as a bit questionable to do these things when lots of other people have committed much more centrally to the project with little immediate remuneration.


So the situation here is indeed interesting - I understand your concern. I've asked all the other core contributors to Django their opinions on this and there's been resounding support from them, so that's not a concern.

I also have a day a week free for this, so there's no concern of this robbing income - that day a week was previously used for learning to fly, and now I'm done with that it needs to have something that pays in order to keep me going. If the choice for that is freelance or open source work, and they pay about the same, you can bet I'm going for the open source work.


Yeah, I appreciate the concerns I have may be total nonsense too! I see from your point of view it's a great way to work on something cool/interesting/career boosting and also be paid for it.

Heck, I've got a list of about 13 projects that are roughly affiliated with my day job, but because of whom I work for and the time I have free being used up for other much more frivolous things I haven't had a chance in a year.

Perhaps it is just jealousy on my part :)


I'm a big fan of this investment model. I don't really contribute to open source, my contribution is mostly bug reports and being willing to help the real contributors track down issues.

Stuff like this is how I contribute. I, either personally or via companies I've worked for, have done this kind of thing a lot: approach the developer and ask for our bug or feature to get pushed up the priority queue. Sometimes it's a small tip but one company gave out a $10K contract for a bunch of work.


> I don't really contribute to open source, my contribution is mostly bug reports and being willing to help the real contributors track down issues.

Jeremy Ashkenas on here a few days ago would have said such contribution is a vital and useful one - this is a game of inches - and clearing down bug lists, and beefing up the real ones is a big one


I don't understand the concern. It's a real problem and if someone wanted to do it for free, they could have and would have. Here is a guy who offers to do it if he gets paid a minimum amount. Fine. No moral issues.


> Here is a guy who offers to do it if he gets paid a minimum amount.

I'm sure Andrew would work on this even if the kickstarter had failed - he built South without one after all :)

But having a funded project helps in many ways. It frees him from the distraction of contract work, proves that a lot of people really care about the issue, and finally is a thank-you-donation for the work on Django in the past and the future.


It's a no-brainer if only as a thank-you for something I've used and certainly appreciated.


[deleted]


> The issue is that it arguably violates the spirit of open source. Open source software is meant to be free and available to everyone.

The spirit of Open Source is free like speech, not like beer. Open Source, while often given away, has always been about empowering people to be able to hack on internals, not simply avoid paying money for things. I've paid real money for a lot of open source software over the past decade and a half and done so happily.


Yes, thank you for reminding us all of that.

Kids these days weren't around during the /. debates on this very topic.


Money and open source mix just fine.

The practical reason that open source has exploded is that a large number of interested parties can contribute towards a project that benefits them all, avoiding duplication of effort.

This is just the next logical step. Instead of contributing one hour of my time, I can spend it on my day job and pay for one hour of Andrew's time.

In practice, this is exactly the same as the old model, just with higher efficiency - as a Django outsider (but user), one hour of Andrew's time is worth the first hundred of mine.

If you don't want to contribute financially, you have no obligation to; just in the same way that you can use Django and contribute no code. It also doesn't stop you from helping in other ways - I'm sure Andrew would be thrilled to have help, so long as you spend those countless hours getting up to speed :)


> it arguably violates the spirit of open source. Open source software is meant to be free and available to everyone.

Open source means open source. There's no implicit requirement that the developer must have received no monetary compensation for developing the source.

Besides, a whole lot of open-source software is written by developers who, while developing it, are getting paid by a parent company.

Consider Apple, who has contributed open-source software as an outgrowth of their work on OS X. Do you think they demanded that their developers work for free, so as to avoid tainting the spirit of the open source movement and complicating the ethical calculus? Of course not.


I see this kickstarter model as similar to large companies that contribute employee time to work on open source projects - this just decentralizes it, focuses the development more on the project, and establishes community demand prior to rolling out the feature.

If this continues do you risk drying up other non-compensated commits or risk contributors waiting for funding before pushing code? You might see a small decrease, but I'd be willing to bet the same dynamics that have kept the open source community moving forward will continue - you just might see more targeted, rapid progress in areas people are willing to fund and a small cottage industry emerge that makes money contributing to projects.


Can anyone who's ever worked on a FOSS hobby project honestly say that they wouldn't prioritize differently, if anyone offered them money to work on it?


So you are essentially saying this guy should do it for free.

"Will the person be deprived of their normal source of income for exactly the time period required to do the work?"

Does it matter? It will most likely be on his free time, which for me is many times more valuable than my normal income.

This is why developing open source is going down a dangerous path: When you work for free once, people expect you to work for free always.


Thanks a lot - your contribution is much appreciated! I hope this helps set a trend.


I hope so too.

Maybe this is the way we'll get good, solid Python3 support for Django, quickly. I'll be more than happy to fund (in part) efforts for getting work done on other Django plugins so they work with Python3. The Django book can do with a little concentrated focus too.


Worth mentioning that if someone wants to ask me (I'm running this) questions about this I'll be checking here!


What are the problems with South, as you point to? I think those are a big part of the selling point for your project. :)

Additionally, could you explain how the project will not be obviated by Django's plans to implement built-in schema migration in, I think, 1.6 or 1.7?

You've taken on one of the most important aspects of managing Django services, so I'm genuinely interested, but I just want to know what the thinking is behind this.


I try to address the key points in the Kickstarter, but essentially South is too fragile (it relies on exactly linear histories) and too verbose (bakes the entire ORM state into every migration) - the main point is to fix those.

It's also worth pointing out that this IS the built-in schema migration that's going in in 1.6 or 1.7.
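
To illustrate the "exactly linear histories" point: a rough sketch of one way a non-linear history could be ordered, assuming each migration names the migrations it depends on. This is only an illustration, not the design described in the Kickstarter; the function and the migration names (including `0003_merge`) are hypothetical.

```python
def order_migrations(dependencies):
    """Return migration names so each one comes after everything it depends on.

    `dependencies` maps a migration name to the list of names it depends on.
    (Hypothetical structure, used only for this sketch.)
    """
    ordered = []
    visiting = set()  # names on the current depth-first path, for cycle detection
    done = set()      # names already placed in `ordered`

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"circular dependency at {name}")
        visiting.add(name)
        for parent in sorted(dependencies.get(name, [])):
            visit(parent)
        visiting.discard(name)
        done.add(name)
        ordered.append(name)

    for name in sorted(dependencies):
        visit(name)
    return ordered


# Two developers branch from 0001 and each add a migration; 0003_merge joins them.
history = {
    "0001_initial": [],
    "0002_add_email": ["0001_initial"],
    "0002_add_phone": ["0001_initial"],
    "0003_merge": ["0002_add_email", "0002_add_phone"],
}
plan = order_migrations(history)
print(plan)
```

With a purely linear numbering the two `0002_*` migrations would collide; with explicit dependencies they can coexist as long as something later depends on both.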


You should reiterate in this thread that you are the originator of South as well so if anyone knows its limitations it should be you :)


Alright, alright, have my money then. :)


Are you going to do something about fields needing dependencies? So say you have a django-tinymce field in one of your migrations but remove it later, South will currently still depend on django-tinymce, even though in the database it's just a text field.


> how the project will not be obviated by Django's plans to implement built-in schema migration in, I think, 1.6 or 1.7?

uhmm, it's exactly this project :)


> Additionally, could you explain how the project will not be obviated by Django's plans to implement built-in schema migration in, I think, 1.6 or 1.7?

Maybe he added it in the last 7 minutes, but this is answered in the Kickstarter description.


I will second this. What's wrong with South, currently?

By the way, this is the first Kickstarter project I've considered funding.


South can often fail for reasons that are surprisingly silly, at least in my experience. Though it's still pretty useful compared to having nothing.


Is that with MySQL? I've used it for scores of projects, always with Postgres, and it's never failed me, even if it's been a bit slow when there are lots of rows in play (which isn't really South's fault)


Yeah, with MySQL. You've probably got a pretty good point there. If I were to start a site myself one of the first things I'd do is never use MySQL, though it does improve ever so slowly.

In the latest version (beta?) you can actually have a column with 'default current_timestamp' and a column with 'on update current_timestamp' in the same table, after years of only being able to have one such automatic column per table.

I don't remember the details, but I think I had to fill a column with zeros all the way down just to drop the whole column in the migration anyway or something along those lines.


Thanks!

You say "Schema migration with Django has had a long and complex history". Can you explain, even briefly, why migrations haven't been a part of Django's core so far? I was under the impression that it was a deliberate design choice. What made you change your mind now?


The basic stumbling block as I understood it was that there was no consensus on how to build a good migrations system. Instead of just picking one system and bundling it into Django the decision was to do iteration in third party libraries. We got dmigrations, django-evolution, and South, and over time South won the race...


Does this mean anything will happen with Oracle?

I know that I'm perhaps the odd man out, but South just doesn't work reliably on Oracle, and I really, really miss it.

Alternately, if I were to donate a running RDS instance with Oracle on it, would that help in any way? I seem to remember an issue (one that affects me) being reported and effectively disregarded with a message along the lines of (though not necessarily exactly) you not having Oracle, so it wasn't likely to get fixed.


I'm in exactly the same boat. "Oracle support is in beta". The kickstarter mentions Oracle briefly, but only in a "if there's money and time" kind of way. I don't think a feature like this should be considered complete until all of the officially supported backends have support.

The problem really seems to be the lack of access to an Oracle instance. Unfortunately my company isn't in a position to make available an instance, but I'm sure we could contribute an on-going dollar amount towards one.


As of now you have collected 10k GBP. What are you going to do with the 3k+ in unplanned additional funds? Are you going to adjust your hourly rate (not that I'd mind; your initial "quote" wasn't that high to begin with), donate to charity, deliver more code/value, buy the Django core devs a beer? Genuinely curious. And congrats on making your goal, I love this model and I love that you had the courage to make this effort. Succeeding with this campaign publicly shows that contributing to open source software does not have to be an unpaid endeavor (yes, I know that there are employees who get paid to contribute to open source software). I wouldn't mind if more talented devs sought out crowdfunding to enhance OSS.


Awesome. I will make a donation if I can get approval.

How will the new rebase command be different from truncating the migration table and regenerating initial migrations?


You won't need to tell other machines that there's a new migration set - they'll just realise and migrate appropriately (new machines on the new initial, old ones on the left-over old ones until they're all done and you can remove them).


I didn't think of that. And that would be nice. Thanks!


Are non-Django field types (like PostgreSQL hstore) going to get some love as part of these migrations? If not, a proper and standardized way to write plugins to extend it would be awesome in order to be able to migrate "nonstandard" field types.


Part of this is coming up with a new spec for Django field types to allow this stuff to work better, so that'll hopefully do what you want.


Is there a reason to include this directly in Django rather than leaving it as a pluggable app? Updates/bug-fixes/features are way easier to push when it's not tied into the official update cycle, letting you be a lot more agile.


Congratulations - you've just passed £3500!



I know. So happy.


I haven't used the Django ORM in over a year and I'm pretty happy with Flask + SQLAlchemy for the moment. That said, I instantly backed the campaign when I saw it. South has been a tremendously helpful tool I used in virtually all my Django projects since its inception and I'm glad I can finally give something back.


Hi Andrew, congrats on the campaign and reaching your goal so quickly! And thank you for South! I've been using South for some years and it would be impossible to imagine working on a Django project without it. Two quick questions:

- Are there any South "Good practices" books which you would recommend? South's documentation is great, but it might be interesting to look at some real-world project examples to learn a few new tricks.

- I'm interested in particular in data migration development practices. How do other developers go about developing and testing them? I've discovered that by raising an Exception during the data migration, South will rollback the specific migration, allowing me to iteratively test/debug the migration until it seems bug-free and ready. Are there any other ways to do it (would running the data migration with the "fake" option allow me full read access to the database, but would rollback / skip any writes?).
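
The "raise an exception to roll back" trick above can be sketched outside South with plain `sqlite3`. This is a minimal illustration under assumptions: the `person` table, the `DryRun` exception, and `run_data_migration` are all hypothetical names, and it only works where the changes run inside a transaction the database can roll back.

```python
import sqlite3


class DryRun(Exception):
    """Raised on purpose to abort the transaction after inspecting the result."""


def run_data_migration(conn, dry_run=True):
    preview = []
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE person SET name = UPPER(name)")
            preview = [row[0] for row in
                       conn.execute("SELECT name FROM person ORDER BY id")]
            if dry_run:
                raise DryRun()
    except DryRun:
        pass  # the rollback has already happened; keep only the preview
    return preview


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person (name) VALUES (?)", [("ada",), ("grace",)])
conn.commit()

preview = run_data_migration(conn, dry_run=True)
after = [row[0] for row in conn.execute("SELECT name FROM person ORDER BY id")]
print(preview, after)
```

The preview shows what the migration would have written, while the stored rows stay untouched until you call it with `dry_run=False`.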

Thank you for your time!


There was an interesting chat-log[1] someone posted on /r/Django on reddit talking about dealing with South migration merging/conflicts, and other issues that crop up in a multi-person project.

I'm curious about the best practices for data-migrations as well; I've seen a few places[2][3] suggest that wrapping your fixture loading "./manage.py loaddata <fixture>" in datamigrations is better than leaving it up to initial_data.{yaml,json}

Edit: here's the one I was thinking of: South common pitfalls: http://andrewingram.net/2012/dec/common-pitfalls-django-sout...

[1] https://gist.github.com/sjl/4438002

[2] http://djangosnippets.org/snippets/2897/

[3] http://stackoverflow.com/questions/5472925/django-loading-da...


Hi Greg,

There are no books or anything, alas, nor is there a very good corpus of material on this field - there are a few good talks out there, but I've not seen a great deal of written material.

Perhaps I need to write a book as well? Not sure I need another Kickstarter, though!


I would check out Scala-Migrations for guidance. https://code.google.com/p/scala-migrations/


When I saw this at the top of HN I was thinking "Damn it! The South guy already has a plan for this!" ... then I see it IS the South guy asking for funding to implement his plan!

Done!

Oh, and thanks for the email support now and then :-)


I don't know whether you'll find it useful, but when I wrote a migrations add-on for Rails, I spent a lot of time learning how to query Postgres and MySQL for the status of current foreign keys, indexes, and "check" constraints, including special extensions like "where" clauses on indexes. If you need to do something similar, perhaps you can borrow the queries from here:

    https://github.com/pjungwir/db_leftovers
If that seems useful and you have any questions, please let me know!


I already have written code for this several times in the past for the basic constraint types, but not the where clauses - I'll check it out. Thanks!


Having South take that leap into Django core will be awesome.

I'll bet one natural result of it entering core will be better shared understanding of best practices & foibles of different backends.

Managing migrations on increasingly large tables is stressful for me primarily due to my lack of knowledge, moreso than fragility in any piece of the puzzle.

In any case, I've thrown in my pledge - best of luck!

(If anyone has any good recommended reading for DBA-type knowledge in a devops world, I'd love an Amazon link you think is worthwhile.)


> Managing migrations on increasingly large tables is stressful for me

large like lots of fields or large like millions of rows? if the latter, then migrations are a PITA and you'll often find inserting a select into a recreated new table is faster than altering the existing one. but that's probably not something a migration tool like this should/could support without lots of your work anyway.
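
A rough sketch of the "insert a select into a recreated table" approach, using `sqlite3` purely as a stand-in so it runs anywhere. The `event` table and the new `source` column are made up for illustration; on MySQL or Postgres the locking and speed trade-offs differ, and any writes arriving during the copy need separate handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO event (payload) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()

with conn:  # commit everything together, or roll back if a step fails
    # 1. Create the table with the schema you actually want.
    conn.execute(
        "CREATE TABLE event_new ("
        "id INTEGER PRIMARY KEY, payload TEXT, "
        "source TEXT NOT NULL DEFAULT 'web')"
    )
    # 2. Copy the existing rows across in one statement.
    conn.execute("INSERT INTO event_new (id, payload) SELECT id, payload FROM event")
    # 3. Swap the new table into place.
    conn.execute("DROP TABLE event")
    conn.execute("ALTER TABLE event_new RENAME TO event")

rows = conn.execute("SELECT id, payload, source FROM event ORDER BY id").fetchall()
print(rows)
```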


Rows, not columns. But more so than really large tables, I find MySQL will tip over even on smaller tables, not very hot tables, if I want to do the work without downtime.

(It may well just be my environment, but it is fairly standard RDS on AWS.)


It's a limitation of MySQL unfortunately. MySQL does schema changes by creating a new table with the new schema and copying the old data into it and doing an atomic switch over. The table gets locked for obvious reasons.

Percona has pt-online-schema-change [1] but I have no experience with it or if it works on RDS.

There are other solutions. The most simple is doing the table rewrite yourself (but risking losing data). The more complex is to have a buffer/queue that can hold requests until the migration completes but this only works in certain specialized situations.

I'd love to find a better solution to migrations in MySQL. I've had migrations take 15 minutes on a relatively small table (2M rows). I'm a little scared to try pt-online-schema-change for fear of corrupting the db.

I would switch to PostgreSQL (nothing but love for that database from me) but the ORM I'm using has issues with identifier quoting and I can't use my "user" table with it.

1: http://www.percona.com/doc/percona-toolkit/2.1/pt-online-sch...


We were using PGSql at the beginning of this project, but I was lured by the automatic backups / restore capabilities that RDS provides and I migrated.

One of my co-workers used to pound on PGSql pretty hard in a very high traffic environment and has nothing but good things to say, but I'm not crazy about the idea of having to learn all of PGSql's foibles like we have MySQL.

My impression is that MySQL 5.6.x (we're on 5.5.x) should help with the schema migration pain, but I don't have any hard evidence for that claim.



I was hesitant to choose Python 3.3 for a new project with Django 1.5 because of compatibility issues with add-on libraries. I've been able to get South working from the tip of the hg repo, which was a big one for me.

Thanks for working on this, it makes me much more comfortable with my Python 3.3 decision knowing migrations will have first class support in Django. I just sent in my contribution.


I'm really glad to see someone tackling this; even better that it's someone with plenty of experience and insight. I wish more people would give me opportunities to help fund big features like this in open source.


I found myself talking to a co-worker about this being a problem last week.

We found that having this functionality as part of the core is something that makes a framework feel significantly more agile. Virtually nobody gets their schema exactly right the first time. Asking developers to learn about and use 3rd party software (if it even exists) adds a very real friction to migrations. Altering your data model from time to time is a natural operation that occurs during application development and should not seem out of place.

I'm very excited for this, thanks Andrew!


Andrew--

We use Django and South in our projects, and really appreciate the work you have done. I just contributed, and am excited about this project.

My one suggestion/request: Please make it an emphasis to create awesome documentation. I am familiar with South now at this point but often find myself wishing more use cases and more examples were documented. I can usually "figure it out," but great documentation saves everyone time. This seems like a great project, so make the documentation great too!


EDIT: By 'not possible' I mean it's not possible to delegate responsibility for it to a tool that derives what the differences in schema are.

Schema migrations are not possible in general.

Yes, you can diff two schemas and generate statements to turn one into the other. Adding columns works fine this way. Removing columns works fine.
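
As a toy version of that diffing step: a sketch that compares two column maps and emits ALTER statements. The `diff_columns` helper and the `person` table are hypothetical, the SQL is generic rather than tuned to any one database, and type changes are deliberately ignored.

```python
def diff_columns(table, old, new):
    """Emit ALTER TABLE statements turning column map `old` into `new`.

    Both arguments map column name -> SQL type. Only added and removed
    columns are handled; changed types are ignored in this sketch.
    """
    statements = []
    for name, sql_type in new.items():
        if name not in old:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type};")
    for name in old:
        if name not in new:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {name};")
    return statements


old = {"id": "integer", "fullname": "varchar(100)"}
new = {"id": "integer", "name": "varchar(100)", "email": "varchar(254)"}
statements = diff_columns("person", old, new)
for statement in statements:
    print(statement)
```

Note what the diff cannot see: if `fullname` was really renamed to `name`, the output still drops one column and adds another, so the data is lost unless a human steps in.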

Do you really need a library to add and remove columns? Not really. But rollback, someone will say. Schema migration libraries let you undo things. Nope. You can't rollback a drop column once it is committed, that data is gone.

What happens when you run into issues in production that you didn't have in your test/development environment?

Basically, automated schema migrations don't work. However, I guess if you haven't been through it you'll just have to learn the hard way. :)

SQL gives you a very flexible environment. You never have to break existing code to push out new features, if you are just a tiny bit careful.


Well, do you EVER actually need a library? No, mostly you can always do stuff manually. But South makes a lot of stuff easier/less brittle, especially when you have a) multiple servers, b) multiple developers, c) multiple environments, and/or d) volatile data structures. It is also useful to see what database changes a pull request introduces. And don't get me started on data migration, which is TRULY easier in South, if you are working with multiple branches/developers and/or have schema changes as well.

South certainly has its warts, but that's why I'm happy to contribute to Andrew's project.

Oh, and rollback: Depending on what the data is, it might actually be possible to rollback. It might be an aggregate column of some sort, or moving of data from one table to another. But it's mostly not superuseful, and it's not a big part of the awesomeness of South.


Off-topic, but how is money "earned" from Kickstarter taxed?


As someone who just spent a lot of time learning Django, with it being the first web framework I've ever used, thank you! The current system resulted in some serious headaches while learning, and these changes will definitely remove a huge barrier to entry for people like me.


Thanks for doing this Andrew - our company uses South daily, and I've griped about a lack of a core Django migration capability from day one. It's a pleasure to support such a specific and useful project. Money well spent.


You're more than welcome, thanks for contributing! It's companies like yours that help such a good ecosystem exist.


How is this different from the Ruby on Rails way?

I have been away from the Django project for a long time, and I really like the Rails way. I have backed it anyway; everyone deserves a proper migration system.


South is actually based on ActiveRecord's migrations a little bit (or rather, it was 4 years ago) but the key difference is autodetection of changes - Rails models are essentially declared via migrations, while Django ones are created in a separate models.py file, so there's a need to detect when changes are made.


Is it anything like DataMapper's[1] auto_upgrade!() and model definition approach?

1. http://datamapper.org/why.html


This is a no brainer donation in so many ways... from the obvious of helping you complete this which will benefit so many people directly, to helping establish this very fine precedent!

Thanks and good luck!


For stuff like this, why don't people just set up their own donation site? The target audience isn't necessarily off-put by other steps, and you can get past the kickstarter fees.


Kickstarter fees are something like 5%, right? (And a further 5% for payment fees, but you'd have to deal with that anyway.)

Out of the goal, that's 125 pounds. How many man-hours do you think it would take to set up the site, test it, secure it, etc? I can't see that ever being worth it for relatively small goals like this! And that's before you even consider that folk are probably more likely to feel comfortable donating to a kickstarter, since they already have an account etc.
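
Spelling out that arithmetic as a quick sketch: the 5% rates are the estimates from this comment, and the 2,500 pound goal is only inferred from "125 pounds" being 5% of it, so treat all three numbers as assumptions.

```python
# Assumed figures, taken or inferred from the comment above.
goal_pounds = 2500          # inferred: 125 is 5% of 2500
kickstarter_percent = 5     # "something like 5%"
payment_percent = 5         # "a further 5% for payment fees"

kickstarter_fee = goal_pounds * kickstarter_percent / 100
payment_fee = goal_pounds * payment_percent / 100
total_fees = kickstarter_fee + payment_fee

print(kickstarter_fee, payment_fee, total_fees)
```

Only the Kickstarter share is avoidable by self-hosting; the payment share would be owed to whichever processor you used instead.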


a html web page and a paypal / whatever / ... account are enough come on ...


The Django Software Foundation and I considered it, but Kickstarter is really well-designed for making new projects and is a trusted brand for payments. Building our own would have taken both time and having to get a merchant account working, which is not a good use of resources, really.


First thing I thought while reading this kickstarter was "Why re-invent South.. it's already perfect!". I changed my mind when I understood who the author was. To be honest, I don't fully understand why the new version is needed.. we should just include South in the Django core, problem solved. But hey, I'm no expert in schema migration and I'm happy to help the author move forward with his project. The only complaint I had with South was that it wasn't installed by default ;-)


Andrew spoke about the current situation at DjangoCon last year (2012). Not sure how much of this video still stands, but it's good for a general overview of the current state and his plans for the core integration - http://www.youtube.com/watch?v=gwP7zLDDdPA


Great idea! First time pledging to kickstarter. I think there are a lot of improvements to be made related to database migrations. I built DevJoist (devjoist.com) with a similar goal in mind, but primarily for PHP developers. Multiple people have pointed to Django south as a great solution, so I'm happy to hear you're going to make it even better. If you ever want to chat hit me up! Email in profile.


Speaking of migrations, I just got a super-simple migration system merged into Chicago Boss!

https://github.com/evanmiller/ChicagoBoss/blob/master/README...

Well, almost...it's still missing one pull. Even if it's very, very basic, I'm pretty happy to have something rather than nothing, though.


Mybatis migrations is also a great alternative for schema migrations

http://www.mybatis.org/migrations/index.html

It is made by the Mybatis team but it is not dependent on the Mybatis object mapping framework or on Java. It runs from the command line.


Wow, I haven't been following Kickstarter projects as closely before, but this seems to be rising quite quickly!

I guess all the right ingredients are here: someone competent providing a useful service with an achievable goal. Nicely done and good luck!


So, it looks like this is going insanely well, as a kickstarter.

What do you plan to do with the excess money, if you raise more than the highest, £7000 amount? Donate the remainder to the Django Software Foundation, hopefully?


If it gets that high, I'll likely give some back to the DSF, yes, or put it towards other core feature projects. Or perhaps just have the next DjangoCon sponsored by "Schema Migrations".


Looks like it's already almost at £10k.

Can/Do you have plans to add any additional features in addition to the existing 'stretch goals'?

I'm sure everyone has their own pet wishlist for Things I wish South Did Better, and I wonder what an appropriate forum for suggestions might be.

Mine would be some sort of 'keyframe'/checkpointing, a bit like the full history reset hack[1], but with more control over exactly where you split things.

[1] http://stackoverflow.com/questions/4625712/whats-the-recomme...


I love kickstarter as much as the next guy, but this is stupid. You just threw 10% of your donations away. I don't think Kickstarter is necessarily the best tool for this kind of fundraising.


These days when I see a project with the word "Kickstarter", it gives me an immediate understanding of the basic funding model and the mechanics of it. I'd be less likely to donate if it were just a "donate with paypal" link on some blogpost.

I don't know whether that sentiment scales to a broader audience, but I strongly suspect it does.


I just backed this Kickstarter, good luck on hitting your stretch goals. While South is decent enough, I've always thought it needed replacement or a lot of renovation.


This looks awesome. One question: why not release it as a third party module first and then merge it into the core later, à la staticfiles?


wow! More than 10.000 pounds. Are you creating a new way to support Open Source software? Hope this catches on for other projects.


I hope someone can invent an SQL diff engine, e.g. input two files of create table sql and output the alter table sql.


this is great. i'm planning on contributing to this and will try and convince some of my colleagues to do so as well. South is very handy and much appreciated but undoubtedly it can be improved a lot and should be part of django core. This project will be wonderfully useful to the whole community.


What is bizarre is I saw it, thought it was great, came to post it on this site, and boom, was far too late.


What about using SQLAlchemy Migrations?


Django ORM isn't based on SQLAlchemy's core (there were some proposals concerning this idea https://groups.google.com/d/msg/django-developers/0IuJssTt8t...), so I guess using Alembic or other SQLA migration tool would require some frankensteining to make it work (so it listens to the Django models.py definitions) and bringing in external dependencies, while Andrew's proposed solution is to make it native and more automatic.


Exactly, the value here is in linking it directly to the models and apps paradigm of Django. South already does this, and people love it, so it seems reasonable to continue down that path and make it even better.


Just wanted to say thanks for South, it makes my projects a whole lot easier. Looking forward to this.


Thank you, this was absolutely necessary. Can't wait for it to release.


Some stats on the traffic sources to this page. Looks like Twitter was important to getting this project funded:

https://bitly.com/ZhmXuw+


How about this:

- I use South which is pretty good

- I use a NoSQL db. I will lose some things, but gain a lot as well (and in my point of view, the gains are bigger)

I'm not sure 'buying features' of software is a good way to go. (I'm not saying it's bad)

Edit: Why the hate? It certainly is better to have an integrated solution, I'm not saying this proposal is bad, only that there are other alternatives.

And the point is moot, donations went over the target already, hopefully this is going to be something big!



