Pensieri di un lunatico minore

27 September 2005 Technology

Least common denominator

Over the past few years, I’ve worked with a bunch of relational database systems: Oracle, PostgreSQL, Sybase, Informix and MySQL. Each of them has strengths and weaknesses, but one stands out as the “dumbest of the bunch”: MySQL. It’s missing huge swaths of the SQL standard, implements much of the rest of it with the kind of slip-shod quality that one expects out of General Motors, and manages to entice the geeks of the world into a sense of accomplishment—without the actual accomplishment.

This came to mind as I read a post on database-agnostic schemas for Rails. I understand the alure of “database independence,” but it’s a false sense of success. What you’ve gotten, if you include such toys such as MySQL, is a “least common denominator” situation. Let me explain how this situation plays out as an example.

In Rails, the system can introspect into the database to find all sorts of things—no need to define column names, etc. This is wonderfully useful. Unfortunately, it imposes a naming convention to find the primary key (it must be called id or you have to be explicit), you have to tell it how to find other tables, etc. All of this is an artifact of its originaly design with MySQL in mind. MySQL (until 5.0) doesn’t have a true data dictionary—and honestly who knows how braindead their new implementation is—so you couldn’t introspect into finding all the foreign keys, primary keys, constraints, references, etc.

Rather than encouraging the construction of proper database architectures, such as those with in-database constraints, referential integrity, stored proceedures, what we have is another level of people who are doing lots of things in their application that should be done in the database layer. Ugh.

Just a rant for the morning, but everytime I see someone rave about MySQL, all I can think of is all the times it’s shot me in the foot when it’s not done what every sane database does, or has some half-assed implementation. This isn’t rocket science, and it’s been well-documented since the 1960s. Why must we keep making the same mistakes over, and over, and over, and pretending they’re “new”.

This entry was posted at 9:57 am on 27 September 2005 and is filed under Technology. You can follow any responses to this entry through the post-specific RSS 2.0 feed.

“In Rails, the system can introspect into the database to find all sorts of things—no need to define column names, etc. This is wonderfully useful. Unfortunately, it imposes a naming convention to find the primary key (it must be called id or you have to be explicit), you have to tell it how to find other tables, etc. ”

I am a HUGE PostgreSQL fan. This obstacle is overcome with one line of code in your model. If you choose to use id as your primary key, you don’t have to add that one line. It’s a convenience… and I think many of the people who bring this up don’t realize that it’s not a big deal. :-)

I mostly agree about MySQL, but I think there are some people using it who are extremely competent, aware of its flaws, and maximizing its strengths. Brad’s presentation on the LiveJournal architecture, for example, gave me the impression that the way they use MySQL is more “fast data store we understand comprehensively” than “standards-compliant relational database”.

How dare you say something negative about Rails! Fear the wrath of DHH!

“This isn’t rocket science, and it’s been well-documented since the 1960s. Why must we keep making the same mistakes over, and over, and over, and pretending they’re “newâ€?.”

Ha! You should take a look at ColdFusion one day (e.g. no writing your own functions until version 5, and they pitched it as “UDF”—user-defined functions).

P.S. The first time I tried to submit this, I got:

WordPress database error: [Illegal mix of collations (utf8_bin,IMPLICIT) and (latin1_swedish_ci,COERCIBLE) for operation ’=’]
SELECT comment_ID FROM wp_comments WHERE comment_post_ID = ‘1789’ AND ( comment_author = ‘Joe Grossberg’ OR comment_author_email = ‘josephgrossberg@hotmail.com’ ) AND comment_content = ’\”This isn’t rocket science, and it’s been well-documented since the 1960s. Why must we keep making the same mistakes over, and over, and over, and pretending they’re “newâ€?.\” Ha! You should take a look at ColdFusion one day (e.g. no writing your own functions until version 5, and they pitched it as \”UDF\”—user-defined functions).’ LIMIT 1

I guess you’re in a glass house on this one.

I didn’t write this shit. :-) Although honestly, Wordpress seems to be more flakey than most, and I really need to have Textdrive migrate my backend to Postgresql. They just make it harder than it should be.

I miss running my own stuff.

It is one thing to talk about “new applications” that are written with Rails, but when we you discuss retrofitting things onto an existing schema that is not modifyable, Rails starts to have some massive issues. Now, I accept that it’s not within it’s real “sweet spot,” and that’s fine, but it would do a lot better were it to actually use the data dictionary to find these things out.

Opinion:

[Insert database/application/etc. name] is an excellent choice for a certain subset of problems. However, it is a poor choice for other problems.

The adage about a carpenter knowing his tools and having a large number of tools to draw on is the most appropriate, yet least used approach when constructing applications/systems/etc.

The specialist, per Mills, is the individual who continually focuses on an ever-decreasing scope approaching epsilon until at some point the specialist knows nothing about anything.

It’s definately one of those “one of these things is not like the other” situations. Rather than dumbing the software down for a poor excuse for a database (really more of a structured file), you should write for the real target, and if you feel some over-arching need for portability, something that’s severely over-rated around databases, then you should fill in where other’s lack.

The truth of the matter is that portability between most full-featured databses is pretty simple, and mostly a matter of syntax, and sometimes more work on stored proceedures. When MySQL is injected, you have to rip out the data-integrity component that should be in the store, and put it in the applications. That’s a piss-poor decision.

Glad you said it in print!

Most people don’t realize how feeble MySQL really is.

And let’s not kid ourselves: the reason it’s become popular is because it’s fast, easy, and most of all free! Which is more than we can say of any “real” databse (though I have no experience with Postgres). What DB do you recommend that meets those requirements?

Forgot to add: There used to be a section in the MySQL docs (can’t find the link), justifying their lack of transactions, and saying something like:

“Data can always get lost. You always need backups, even if you were to have transactions. Even the famed Oracle is reputed to have lost data…”

...whew

MySQL is free if you obsess over the GPL and ignore everything else. It’s not free if you’re in anyway “for profit.” PostgreSQL, which is comparable to Oracle for most applications (80% or more) in capabilities is licensed under the BSD license, which is basically “do whatever.”

MySQL is perfectly free for use if you want to make money with what you’re doing – the difference is you cannot include MySQL or use the embedded MySQL database (I believe), if you aren’t writing GPL software.

It’s a bit of a myth that you have to pay MySQL AB to use their database in “for profit” software. Instead, you can pay MySQL AB for their support, or to distribute it easily with your product. That’s all. So for developers, and many competent administrators, it’s entirely free to develop and deploy with. And that’s why it’s popular.

MySQL AB used to be very open about the differences between the GPL and Non-GPL licenses for their product, but since the new website redesign focused at enterprises, they’ve started to hide, or not talk about this.

Instead, they’ve rebranded the two, to make businesses want to purchase the “Network Edition” instead of using the “Community Edition”, see: http://www.mysql.com/network/compare.html

Which is fine. It just means that if you want something for free, you’ve got it, and if you want to pay for something with the assurance of support and all the extra perks, you can do that too. It fits Rails / “Get Real” perfectly – you can start simple, and “scale out” as you depend on it more.

MySQL has had transactions and foreign key constraints for several years. Extracting primary key and referential integrity data through DESCRIBE TABLE commands works fine, and the ORM framework I use supports this and uses this data. No naming convention requred. It’s fair to complain that this data isn’t available in a standard SQL way (unless you use 5.0), but saying it isn’t available at all is incorrect.

This is also why Rails isn’t as DRY as it would like to be. As Robby Russell mentions, it’s fairly easy to add support for foreign keys to your model, except that now you are defining your model in two places… definitely a case of repeating yourself. I think there may be technical reasons for compromises, but to claim they aren’t compromises (and even coin phrases to add legitimacy: i.e. “single layer of cleverness”) borders on dishonesty.

My feeling on MySQL is much the same as many programmers feel about BASIC (or these days, PHP): it causes permanent brain-damage in otherwise apparently smart people.

[...] 1 I suspect this is another latent sloppiness inherited from the MySQL brain-damage. MySQL, at least as late as 4.1, doesn’t really integrate transactions and DDL, and never will outside the InnoDB engine. This means you can’t actually have atomic schema migrations. Bad juju if you ask me. [...]

MySQL was lacking in certain aspects until its more recent incarnation (perhaps still is) – But lets put the boot on the other foot, to many system have limitations due to the overkeen usage of referential integrity and stored proceedures.

Unless theres multi-tier modelling reasons for this, processing and data validation should be performed within the top layer – burying this deep down makes them difficult to debug and more expensive to maintain.

Actually MySQL still, in my experience, has all sorts of bizarre bogosity related to things it’ll accept and silently do the wrong thing without warnings. That’s just unforgivable.

Referential integrity, etc., can of course be over-used, and some people might argue it should never be used, but I see it as a last line of defense for data. There are simply certain values, certain behaviors, that are never acceptable. Ever. Also, much of what I do ends up being an integration database for various reasons outside my control, and therefore the DB is one of the few places to enforce data integrity.

If everything was a trivial PHP app, then the world would be different. But they’re not, fortunately.

[...] Christopher Petrilli repete um frequente mal-entendido sobre Active Record, o ORM de Rails, em Menor Denominador Comum. O raciocínio diz que o MySQL está nos segurando de tomar vantagem de funcionalidades mais avançadas dos bancos de dados disponíveis no PostgreSQL, Oracle e outros. E que se pelo menos MySQL fosse mais esperto, tivesse mais funcionalidades, estaríamos abrançando-as de braços abertos. Errado. [...]

Both comments and pings are currently closed.