Pensieri di un lunatico minore

16 July 2005 Smalltalk

The joys of Smalltalk

So I’m working on various parsing doo-dads, mostly in Smalltalk because it’s what I like most in life. One of the things that I’ve gotten used to in Python, Perl, Ruby, whatever, is the ability to quickly convert a delimited string (let’s say with my favorite delimiter, the pipe, |) into an array of some form. For Pythoneers, it would be myString.split('|'). Very simple, and nearly identical in most other languages.
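For anyone who hasn’t seen it, here is roughly what that looks like in Python. The record in `line` is made up purely for illustration.

```python
# A made-up pipe-delimited record, just for illustration.
line = "alice|42|smalltalk"

# str.split returns a list of the pieces between each delimiter.
tokens = line.split("|")
print(tokens)  # ['alice', '42', 'smalltalk']
```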

Smalltalk doesn’t have this obvious functionality, but its streams do have upTo:, which returns the elements of a stream up to a specific character. More importantly, they have upToAndSkipThroughAll:, which returns the bit up to a specific delimiter (given as a collection, such as a string) and then skips over that delimiter. Unfortunately, I hate how that’s spelled, and I hate having to loop over it.

So what to do? Extend the class model, of course! Ta-da!

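splitOn: aDelimiter expectedCount: anInteger
    "Answer an OrderedCollection of the tokens between occurrences of aDelimiter.
     anInteger is the initial capacity, which avoids resizing the collection."
    | aCollection |
    aCollection := OrderedCollection new: anInteger.
    [self atEnd] 
        whileFalse: [aCollection add: 
                     (self upToAndSkipThroughAll: aDelimiter)].
    ^aCollection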

Now, that’s not quite splitOn:, is it? Well, after doing some minor benchmarking with my application, I discovered that I was needlessly resizing the OrderedCollection. I happen to know how big it’s going to be, since it’s a regularly formatted line. So I can pass that in. If I don’t want to pass it in, I can just do this:

splitOn: aDelimiter 
    "Answer an OrderedCollection of the various tokens, guessing at 25 of them."
    ^self splitOn: aDelimiter expectedCount: 25

This is a common idiom in Smalltalk. Because every selector must be spelled out in full, with all of its arguments, you can’t have “default parameters” as you might in Python. The idiom is to implement the shorter selectors as calls that delegate to the most detailed one. This actually is, in many ways, more cleanly specified than Python, but it’s a bit more typing sometimes. Fortunately, I save so much elsewhere that it’s not something I sweat.
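For comparison, here is a rough sketch of the Python side of that trade-off. The function name `split_on` and its parameters are hypothetical, invented only to mirror the Smalltalk selectors. Since Python lists resize themselves, the capacity hint has nothing to do here and is accepted just to show the default-argument syntax.

```python
def split_on(text, delimiter, expected_count=25):
    """Hypothetical Python analogue of splitOn:expectedCount:.

    expected_count mirrors the Smalltalk capacity hint. Python lists
    resize themselves, so it is accepted here but not used.
    """
    return text.split(delimiter)


# One definition covers both call shapes.
short_form = split_on("a|b|c", "|")
long_form = split_on("a|b|c", "|", expected_count=3)
print(short_form, long_form)
```

The Smalltalk version gets the same effect with two selectors instead of one signature with a default.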

Now, I can just type:

tokens := aString readStream splitOn: '|'.

Voilà! I’ve now gained this on any Stream in the system. Try changing a core class in your ossified language of choice.

Nota bene: Ruby can do this as well, if not nearly as dynamically.

Ack, as someone on #smalltalk pointed out, this is actually already in SequenceableCollection as tokensBasedOn:. I swear I searched, since I thought it was there, but I guess it was late at night and I was dead. Oh well, still easy to add.

This entry was posted at 12:56 am on 16 July 2005 and is filed under Smalltalk.

I knew there was a way to do this too, but I couldn’t find tokensBasedOn: either. This is one of my complaints with the Smalltalk libraries—finding what you want can be difficult if you are at all rusty or don’t think/speak the same way as the particular class authors. There has also been resistance to adding synonyms for other methods. Me, I’d rather bloat the library and be able to find what I’m looking for. Did you consider creating a split: method that calls tokensBasedOn: now that you know it’s there?
