Pensieri di un lunatico minore

1 June 2008 Programming

Sacred cows

It’s amusing as all hell to watch some people blow up about the MagLev Ruby implementation. Personally, while I find Ruby interesting, the performance is embarrassingly bad for a lot of problem domains. That doesn’t mean it’s not useful, it just means it’s not as useful as it could be. I’d like to take a couple comments that Charles Nutter made and just “observe”:

First off, they demonstrated its distributed object database automatically synchronizing globally-reachable state across multiple VMs. It’s an amazing new idea that the world has never really seen…

except that it isn’t. This is based on existing OODB technology that Gemstone and others have been promoting for better than a decade. It’s cool stuff, no doubt, but it’s been available in Gemstone’s Smalltalk product and in their Java product for years, and hasn’t seen widespread adoption. Maybe it’s on the rise, I really don’t know. It’s certainly cool, but it’s certainly not new.

There are only a tiny handful of companies in the world who can show production object environments of that scale, Gemstone and Objectivity being two of the most successful. Both are rooted in the Smalltalk world originally, though they offer other products to most of their customers. What I find troubling is how dismissive Mr. Nutter is of the problem domain. The fact that most people worship at the relational temple doesn’t mean that it’s actually a good idea in many cases. Some of the most complex databases in the world are object databases. There’s a reason a huge percentage of Objectivity’s money comes from the intelligence community. Data fusion is everything.

Except that these are results reported entirely in a vacuum. Whether this is fib following the “rules” of Ruby is entirely an open question. Whether this is method dispatch adhering to Ruby’s call logic is entirely an open question. Whether this is a while loop using all method calls for its condition and increment steps is an open question. Because the Maglev guys haven’t started running Ruby tests yet. Is it Ruby?
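For readers wondering what “all method calls” means here, a minimal sketch follows. It is not the MagLev benchmark code, which neither post shows; `fib` and `fib_iter` are names made up for illustration. In Ruby, `<`, `+` and `-` on integers are ordinary method calls, so the loop condition and the increment can be written as explicit sends:

```ruby
# Recursive fib in the usual operator syntax.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Iterative fib with the while condition and increment spelled out as
# explicit method calls. i.<(n) is the same call as i < n, and a.+(b)
# is the same call as a + b.
def fib_iter(n)
  a = 0
  b = 1
  i = 0
  while i.<(n)
    a, b = b, a.+(b)
    i = i.+(1)
  end
  a
end

puts fib(10)
puts fib_iter(10)
```

An implementation that compiles `i < n` straight to a machine comparison without checking whether `<` has been redefined will be faster, and that is exactly the kind of shortcut the quoted paragraph is asking about.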

Well, once someone actually defines what Ruby is we can have that discussion. One of the most frustrating aspects of Ruby is that there is no actual language definition. There is only a language implementation, several at that, and if I recall correctly all have had to reverse “engineer” the language from the implementation. This makes discussing “is it Ruby” a laughable prospect, and Mr. Nutter should know that. Let’s examine “some words of his from just a few weeks back”:

Compatibility is hard. I’m not talking a little hard, I’m talking monumentally hard. Ruby is a very flexible, complicated language to implement, and it ships with a number of very flexible, complicated core class implementations. Very little exists in the way of specifications and test kits, so what we’ve done with JRuby we’ve done by stitching together every suite we could find. And after all this time, we still have known bugs and weekly reports of minor incompatibilities. I don’t think an alternative implementation can ever truly become “compatible” as much as “more compatible”. We’re certainly the most compatible alternative impl, and even now we’ve got our hands full fixing bugs. Then there’s Ruby 1.9 support, coming up probably in JRuby 1.2ish. Another adventure.

So before the stones are cast, perhaps the house should be repaired?

Then there’s Maglev. Like the other impls, I’m excited that there’s a new possibility for Ruby to succeed. A high performance, “scalable” Ruby implementation is certainly what this community needs. But unlike most of the other implementations, it seems like Maglev is pushing performance numbers without compatibility metrics; marketing before reality. Am I far off here?

Here’s the thing. Maglev is a new language on top of an existing VM. One that has over a decade of hardcore production use, something that even Java can’t really claim. There’s a wide gulf between putting a new syntax on top of an existing VM architecture and building a new VM. VMs are intensely hard, and only a few people in the world are truly good at them. Dynamic VMs all derive from Smalltalk, and even Java is having to glue on all sorts of silly doo-dads to try and pretend it can support it without huge cost. Ruby is very much Smalltalk in different clothing, and therefore I would expect Ruby to run at near Smalltalk speeds once you clean up some pieces. I know the great wizard had a Python implementation running on a Smalltalk VM, alas it never got released.

The truth is that not all of these optimizations are kosher right now. Removing the ability to override Fixnum#+ certainly makes it easier to optimize addition, but it’s not in the spirit of Ruby. Removing frames may be legal in some cases (like this one) but it’s not legal in all cases. And of course I’ve blogged about how Thread#kill and Thread#raise are broken, but we have to support them anyway. On and on we can go through lots of optimizations you might make in the first 100 days of your implementation, only to back out later when you realize you’re actually breaking features people depend on.
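To make the Fixnum#+ point concrete, here is a small sketch, under stated assumptions, of what “overriding addition” looks like. Fixnum was folded into Integer in Ruby 2.4, so the sketch patches Integer; on the Ruby 1.8 discussed here the class would be Fixnum. The helper name `count_integer_additions` and the alias `__builtin_plus` are hypothetical, and the patch is undone after the block so it does not leak:

```ruby
# Hypothetical helper: temporarily replaces Integer#+ with a counting
# version, runs the block, then restores the built-in method.
def count_integer_additions
  calls = 0
  Integer.class_eval do
    alias_method :__builtin_plus, :+
    define_method(:+) do |other|
      # Use the saved built-in here; calling + would recurse into this method.
      calls = calls.__builtin_plus(1)
      __builtin_plus(other)
    end
  end
  yield
  calls
ensure
  # Put the original method back and drop the temporary alias.
  Integer.class_eval do
    alias_method :+, :__builtin_plus
    remove_method :__builtin_plus
  end
end

sum = nil
additions = count_integer_additions do
  a = 1
  b = 2
  sum = a + b   # dispatches to the redefined Integer#+
end
puts "sum=#{sum} additions=#{additions}"
```

A VM that wants to compile `a + b` to a native add has to notice this kind of redefinition and fall back to normal dispatch, which is the cost the quoted paragraph is weighing against the speedup.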

Features, or design errors? Because if it’s the latter, then breaking them is the best thing that can be done. The refusal to fix them in the past is simply a weakness. The totem of backward compatibility is what gave us Windows Vista.

This entry was posted at 12:19 pm on 1 June 2008 and is filed under Programming.

Not sure if this comment goes through ;)

A few regards:

“once someone actually defines what Ruby is we can have that discussion.”

First, I believe the one that defines Ruby is of course the author himself.

It wasn’t him who started the ruby-spec stuff, it was as far as I know the Rubinius guys (or JRuby? Don’t recall really, as long as they work together too instead of against each other, it doesn’t matter anyway).

From a user’s point of view, anything that “ruby” (the binary) understands is valid ruby code. A spec can either reach this 100% or not (or it can even reach something different, like target a new language.)

“There is only a language implementation, several at that”

The only one that really matters in the end – sorry guys – is the one that matz “defines” (as in, develops, maintains etc..).

‘This makes discussing “is it Ruby” a laughable prospect, and Mr. Nutter should know that.’

I believe he has a valid point if we all accept that there is in fact only one real ruby implementation – the one by matz.
The others are great for the future but in the end, matz-ruby will always be the “goal”, unless one wants to create a new language, which I think would be bad. It could at worst split a community, and this is something that should be prevented.

‘Ruby is very much Smalltalk in different clothing’

I totally disagree here. From which point of view?
I don’t know how one can come to such a conclusion, honestly.

Ruby is much more like an advanced perl with OOP done right. I think Ruby has a lot more in common with python than with smalltalk.
I am not so much talking about the OOP, I am talking about people that use ruby (or python, or perl, for that matter).

There are more differences, as in the whole VM/Squeak stuff. Or let’s take a framework like Rails. I really don’t feel this is in any way similar to Smalltalk.
It is much more competitive with python/django too, than with anything from the smalltalk world.

It is a much more bottom up approach as well, as in, one is much more likely to replace shell scripts and similar with ruby scripts, and from there, work “upwards”.
I also think Ruby is more complex than Smalltalk. But people can feel free to flame me for that too. ;)

‘I would expect Ruby to run at near Smalltalk speeds once you clean up some pieces. ’

Sounds interesting but I have doubts this will happen in the next… let’s say 3 years.

‘Features, or design errors? Because if it’s the latter, then breaking them is the best thing that can be done. The refusal to fix them in the past is simply a weakness. The totem of backward compatibility is what gave us Windows Vista.’

Which one specifically? But first let me state that I believe Vista was purposely crappy in order to push for the next generation stuff (including the video/gaming sector for the new DirectX).
You just can’t compare this to a programming language. Microsoft makes decisions totally differently.

Now, about “design errors” – there are none. All must be compatible with the ruby implementation. If one isn’t, it is simply a bug.

I think we should concentrate on what matters in the end – ruby (and ruby code), for the ruby community as a whole.

It is true that right now one cannot test Maglev, right? So Maglev seems to be behind the other contenders that can be tested at the moment IMO.

>> Dynamic VMs all derive from Smalltalk

If you’re going to make such an outrageous claim, you should try to back it up with some evidence.

Regardless of what Ruby is, a Ruby implementation will need to be compatible with other Ruby implementations to run current scripts. If this implementation can’t do that, people are going to say it doesn’t implement Ruby properly. “All these other Ruby implementations run this fine, so why can’t this one? They must be cheating by taking some shortcuts.”

But the main point on the Headius blog was that there has been a lot of hype despite pretty much nothing actually being available. There’s not much evidence this thing actually even exists. You didn’t mention that at all for some mysterious reason. “Ignore that stuff and look over here at this thing!”

People should wait until something is actually available before getting so excited. (Though it certainly may be something to get very excited about in the future.)

First, I believe the one that defines Ruby is of course the author himself.

That this definition is contained within, and only within the author’s mind is troubling for the expansion of the ecosystem.

It wasn’t him who started the ruby-spec stuff, it was as far as I know the Rubinius guys (or JRuby? Don’t recall really, as long as they work together too instead of against each other, it doesn’t matter anyway).

The fact of the matter is that this process is necessitated by the black-box nature of the actual language definition. Since it’s not actually written down, it must be reverse-engineered.

From a user’s point of view, anything that “ruby” (the binary) understands is valid ruby code. A spec can either reach this 100% or not (or it can even reach something different, like target a new language.)

I believe that you and I are working from different meanings of the words specification and implementation. cRuby, to steal a Python idiom, is a concrete instantiation of an abstract idea: Ruby, the language. The fact that the implementation “understands” a specific construct may, or may not, be indicative of that construct existing in the language itself. Without a definition, we are left unsure of intention.

I believe it was Matz himself who said: “If you like Perl, you will like Ruby and be right at home with its syntax. If you like Smalltalk, you will like Ruby and be right at home with its semantics. If you like Python, you may or may not be put off by the huge difference in design philosophy between Python and Ruby/Perl.”

Syntax and semantics are very different beasts, and my implication was that Ruby, more than Python, embodies much of the Smalltalk semantic model.

Ruby is much more like an advanced perl with OOP done right. I think Ruby has a lot more in common with python than with smalltalk.

I am not so much talking about the OOP, I am talking about people that use ruby (or python, or perl, for that matter).

What do the people have to do with the language? That’s a definition of community, but it has little to do with the language itself. The language is most definitely from the Smalltalk semantic model more than any other. The syntax simply looks different.
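A small sketch of what I mean by the semantic model, with the caveat that it is only an illustration and `Recorder` is a made-up class. Literals, nil and classes are all objects that receive messages, iteration is a message carrying a block, and a message nobody understands lands in `method_missing`, roughly the counterpart of Smalltalk’s doesNotUnderstand:

```ruby
# Literals, nil and classes are all objects that answer messages.
literal_class  = 42.class                 # Integer (Fixnum on older Rubies)
nil_is_object  = nil.respond_to?(:to_s)
class_of_class = Integer.class            # Class

# Iteration is a message sent to an integer with a block argument,
# not a special syntactic form.
squares = []
3.times { |i| squares << i * i }

# Hypothetical class: any message without a matching method is routed
# to method_missing, which here just records it.
class Recorder
  attr_reader :messages

  def initialize
    @messages = []
  end

  def method_missing(name, *args)
    @messages << [name, args]
    self
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

recorder = Recorder.new
recorder.deposit(100).withdraw(30)
p recorder.messages
```

None of this depends on how the code is spelled on the page, which is why the Perl-flavoured syntax says little about where the semantics come from.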

I also think Ruby is more complex than Smalltalk. But people can feel free to flame me for that too. ;)

Definitely. Ruby is festooned with corner cases and syntactic sugar. Smalltalk has almost none, and had effectively none in the early days. For an understanding of what I mean by a design, one need only read Smalltalk-80: The Language and Its Implementation by Adele Goldberg and David Robson, which is often referred to simply as “the blue book”. Some of it is found online, and contains very detailed models of the language. While the pieces in the previous link are often actually implemented differently, the user-facing design is effectively unchanged.

Now, about “design errors” – there are none. All must be compatible with the ruby implementation. If one isn’t, it is simply a bug.

That’s a religious argument, but I would argue that all languages have design flaws when viewed from certain perspectives. Lisp has the purity of data and code, but the idiosyncrasies of special forms. Smalltalk has the purity of message syntax, but many see the image as part of its impediment. I’m quite sure that Matz himself sees things he wishes he’d done differently, or intends to in the future. If there were no design errors, then there would be no need for anything but bug fixes and performance improvements. I think the current situation with Ruby v1.8 v. Ruby v1.9 is testament that there is more than bug fixes to be done.

JJH: If you are familiar with another lineage of dynamic VMs that are not derivatives of the approach taken with Smalltalk, I’d be interested. While one can use other models, and some languages do, they all end up converging on that approach because it simply works for the needs of a dynamic language.

To me, Smalltalk is a programming language, not an implementation or an implementation design. But I’m much more anal about this topic than most people, so your terminology was probably fine. I see now what you were saying.

Features, or design errors? Because if it’s the latter, then breaking them is the best thing that can be done. The refusal to fix them in the past is simply a weakness.
Right you are, but the errors must be fixed in a major new version to avoid breaking compatibility, good or bad, with current usage. Ruby 1.8.6 implementations must behave exactly like the Ruby 1.8.6 MRI reference implementation, no ifs, buts, or third party “improvements”.

The threading model Nutter referred to has indeed been fixed, breaking previous code, in Ruby 1.9.0. That is the proper way to deprecate the errors of the past – in new versions. Ironically enough, this lesson is very well illustrated by the experience of the Smalltalk language, which is plagued by incompatibilities between the various implementations. I strongly agree with his principle that this outcome is to be avoided at all costs and implementations must hew as closely as possible to the reference, for better or worse.

Otherwise, good article! I for one welcome the experience and technology a company like Gemstone can bring to the Ruby community, although obviously I’d prefer it if the demonstrations they give were more, uh, complete. Regardless, this seemingly-overnight realisation by the Ruby community that there might be decades’ worth of useful knowledge on VMs and performance hiding in the Smalltalk community’s plain sight can only be a good thing for both languages in the long term.
