Pensieri di un lunatico minore
In the continuing adventures of the most epic of clueless “architects”, comes this great little snippet:
Your business customers want you to scale your Smalltalk application to 768 Coherent CPUs, how would you accomplish scaling on the same box and beat Java?
Gee, I don’t know, space cadet, how do you beat a box that’s hardware-designed, tuned and engineered only to run Java? What other language in this modern age needs a box like that?
Seriously, get a clue. If you can’t read, don’t critique. If you can’t think, go away.
4 thoughts
In an earlier post, I had mentioned that I wanted to use the Chronos library for some work, but was hamstrung because it didn’t support parsing of text strings. Well, now that has changed, and it looks like it can handle most things you throw at it.
One of the things that I always have fun parsing is the brain-dead syslog file format, but I think this can handle it now. The only thing that leaves is being able to interpret relative time, like “tomorrow” or “next Tuesday” and turn that into an interval perhaps?
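As a rough idea of what interpreting relative time could involve, here is a minimal Python sketch. It is not Chronos and not anything the library provides; the function name `resolve_relative`, the handful of phrases it accepts, and the reading of "next Tuesday" as "the first Tuesday strictly after today" are all assumptions made for illustration.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def resolve_relative(phrase, today):
    """Turn a small set of relative phrases into a concrete date (hypothetical helper)."""
    words = phrase.lower().split()
    if words == ["today"]:
        return today
    if words == ["tomorrow"]:
        return today + timedelta(days=1)
    if words == ["yesterday"]:
        return today - timedelta(days=1)
    if len(words) == 2 and words[0] == "next" and words[1] in WEEKDAYS:
        target = WEEKDAYS.index(words[1])
        # Assumption: "next <day>" is always strictly in the future (1 to 7 days ahead).
        delta = (target - today.weekday() - 1) % 7 + 1
        return today + timedelta(days=delta)
    raise ValueError(f"cannot interpret {phrase!r}")
```

Passing `today` in explicitly, rather than calling `date.today()` inside, keeps the function easy to test. Turning the result into an interval would just be a subtraction of two dates.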
6 thoughts
Earlier, I wrote about extending classes, and I wanted to give a clear example of what I personally think is a good use for this. I’m working on some time-travel stuff. As I’m writing some tests to describe what I want the code I’m working on to do, I get to this:
TimeTravelTest>>testToday
| stock version |
stock := Stock new.
version := StockVersion price: 12.25.
stock versionAt: (Date readFromString: '1/12/2006') put: version.
self assert: (stock versionAt: (Date readFromString: '1/12/2006')) price = 12.25.
That works just fine, but it doesn’t really read very well. So, I created a class extension, and added a new message asDate to CharacterArray that just looks like this:
CharacterArray>>asDate
^Date readFromString: self.
so that I can do this instead:
TimeTravelTest>>testToday
| stock version |
stock := Stock new.
version := StockVersion price: 12.25.
stock versionAt: '1/12/2006' asDate put: version.
self assert: (stock versionAt: '1/12/2006' asDate) price = 12.25.
That, to my mind, reads much better, and is a clear example of how extending a class can be a “good thing,” if used correctly. It actually creates a situation where the flow of the code is clearer and more explicit about what you are trying to do. I haven’t actually introduced any new dependency between the two classes, but simply made one more expressive.
I would love to use the Chronos Smalltalk library, but as of yet it is not capable of parsing strings, as near as I can determine, even though it can do everything else, including dealing with leap-seconds. Converting from the VisualWorks Date class to a Chronos class seems silly at this point.
4 thoughts
I admit it: I hate assembler. I hate C, C++, and most everything else other people consider “programming languages.” They are glorified switch-flipping on the front panel of a PDP-8e that we no longer have sitting in front of us. They are tuned for the benefit of the computer, and honestly, as an arrogant bastard, I find my time infinitely more valuable. Computers are servants. Technology must bend to my will, not the other way around. That’s just the way I am.
So, it is with great glee that I find the resurgence of discussions about more advanced programming languages. Take an article from eWeek published yesterday.
The man from Mars sees development languages being chosen for the convenience of machines, despite attendant productivity penalties and difficulty of delivering high-quality code, instead of being chosen for the convenience of the developers who are the actual scarce resource.
Continually, I have reminded people that talented developers are expensive, while CPU cycles are asymptotically approaching free. They are not being “wasted” when we burn them to make the developer’s life easier; we are freeing his own cycles to contemplate better algorithms, more advanced approaches, more adaptive reasoning.
Like anything that’s been around for several decades, LISP carries the baggage of what “everyone knows” about it that is no longer true.
“Everyone knows,” for example, that LISP is an interpreted language and, therefore, too slow for production applications—except that modern LISPs can compile functions for run-time speeds competitive with those of C or C++ programs in algorithmically complex tasks.
I remember the first time I ran into LISP in a high-performance computing environment. This was in the days when Cray ruled the world of supercomputers, and every cycle was expensive. Some of the fastest programs ever written for a Cray supercomputer came out of their LISP compiler. Why? Because the algorithm could be the focus of the intellectual power of the developer, and not bit-twiddling. It was a shock, and yet logical all at once.
The trade-offs are clear. In a study performed in 2000 by Erann Gat, a researcher at the California Institute of Technology’s Jet Propulsion Laboratory, programmers writing in LISP produced programs with less variability in performance than more experienced programmers writing in C and C++.
The fastest versions of C and C++ programs were faster than most LISP implementations, but the median performance of the LISP implementations was actually twice as good as the median performance of the C and C++ code performing typical tasks (more at www.flownet.com/gat/papers/lisp-java.pdf).
For real-world teams, such reduction of technical risk and improved worst-case scenarios arguably outweigh best-case results.
And that’s the thing. We live in the real world, and that world is swimming in CPU cycles, memory, and untapped resources of our systems. Use them. Don’t waste your time worrying about saving 1 cycle when it means the real problems go unsolved. You might find that, by focusing on the underlying problem, the solution comes sooner, with more certainty, and through more interesting work.
These observations, to me, also apply to Smalltalk and other “research languages” that people dismiss as “interesting” but not fast enough. Ruby is another such beast, as is Python, or even Perl, each trading “performance” for expressiveness. Worry about “performance,” in its myopic traditional measure, only when and if it matters, after you have solved the underlying problem first.
Fill your tool-belt with all the tools you need to solve the problem. You might be able to build the Taj Mahal with a pair of dental tweezers and an ice-pick, but it sure won’t be much fun.
[via Phil Windley’s Technometria]
2 thoughts
One of the hardest things to deal with in computer systems is time. While time may be monotonic in the abstract, unfortunately when humans get involved we get the morass of time-zones, time-saving silliness and other things that complicate the ability to work with information that exists temporally. An off-shoot of a discussion on the Seaside mailing list about money (another strange thing to deal with) was a pointer by the talented Cees de Groot to an excellent paper: Time Travel: A Pattern Language for Values that Change (PDF). Written by some very smart people, it details how to deal with the thorny issue of values and time. The examples are written in Smalltalk, but I suspect most anyone can read them.
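To give a flavor of the problem, here is a small Python sketch of a value that remembers its own history. It is only loosely in the spirit of the paper and is not taken from it; the class name `TemporalProperty` and its `put`/`at` methods are hypothetical names chosen for this illustration.

```python
import bisect
from datetime import date

class TemporalProperty:
    """Keeps every version of a value, keyed by the date it took effect."""

    def __init__(self):
        self._dates = []    # effective dates, kept sorted
        self._values = []   # values, parallel to _dates

    def put(self, effective, value):
        i = bisect.bisect_left(self._dates, effective)
        if i < len(self._dates) and self._dates[i] == effective:
            self._values[i] = value           # replace a version with the same date
        else:
            self._dates.insert(i, effective)
            self._values.insert(i, value)

    def at(self, when):
        # Latest version whose effective date is on or before `when`.
        i = bisect.bisect_right(self._dates, when)
        if i == 0:
            raise KeyError(f"no version in effect on {when}")
        return self._values[i - 1]
```

Asking for the value "as of" a date then becomes a binary search over the effective dates, and asking about a date before the first version fails loudly instead of guessing.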
No thoughts
There seems to be a bit of a disagreement going on between Chad Fowler and Ian Bicking about what Pythoneers call “monkeypatching,” Ruby people call “opening a class” and Smalltalkers call “doing the right thing.”
What do I mean? Well, in a bit of commentary by Ian that was sent to Chad, he states:
The use I object to that I see in lots of Ruby examples (and maybe isn’t indicative of most real Ruby code) is when people add methods to other classes that aren’t meant to fix anything, but just because they don’t have an object of their own to hang the method off of. I’ve seen several examples where people add methods to Array to implement some recursive algorithm, instead of using a function.
This is a blurry line in the world of truly dynamic languages, and it reminds me of the absurd notion of final in Java. There are times when I need to extend an existing class to provide additional functionality because that is where it belongs. Sometimes people put things in the wrong place, but the idea that they should just “write a function” isn’t necessarily the right solution either.
For example, there is a coercion idiom in Smalltalk where you use message names like asString to coerce one thing into another. It is totally appropriate to extend this to allow for further expressiveness of the language.
The important thing is that Smalltalk implementations have a clear way to do this—through class extensions, which can be loaded and unloaded as needed by other systems. Is it perfect? No, but the idea that someone in the past thought of everything that might be needed in the future is a grossly arrogant position to take.
Languages are tools, and the more they try and “protect you,” the more they limit your creativity. There are bad times to extend classes, and bad places to put functionality, but that is something one can’t make generalizations about, and has to be evaluated in the context of the specific system.
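For readers who haven't seen it, this is roughly what the Python side of the argument looks like. The sketch below adds a coercion-style method to the standard library's `Fraction` class; the method name `as_percent` is made up for the example and is not part of any library.

```python
from fractions import Fraction

def as_percent(self):
    """Coerce a Fraction into a percentage string, e.g. 1/4 -> '25%'."""
    return f"{float(self) * 100:g}%"

# "Open" the existing class and hang the new coercion on it.
Fraction.as_percent = as_percent

print(Fraction(1, 4).as_percent())
```

Note that CPython refuses this kind of assignment on built-in types such as `str`, so the Python version of the trick only reaches classes written in Python itself. Unlike a Smalltalk class extension, nothing here records where the method came from or lets you unload it as a unit.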
7 thoughts
James Robertson discusses a comparison of Ruby on Rails and Seaside, originated by Reg Braithwaite, which I had read a few days ago and bookmarked for comment. Alas, work keeps interrupting my ability to post on here.
Anyway, the point I wanted to add was that Seaside doesn’t make the “easy” any “easier.” What Seaside makes possible is the nearly impossible, or unmaintainable application that would have been a mess of disastrous inter-dependent pages and logic. Instead, it allows you to encapsulate it all together in a clean manner. The curve flattens and you can now consider applications that weren’t really feasible before.
1 thought
There’s a whole big discussion on comp.lang.lisp that I became aware of through a posting on Lemonodor and further commentary by Dan Moniz. All I can say is: Get a freaking grip guys! How someone licenses their software is their decision, and whether you like it or not only affects whether you give them money, not whether you have to respect it.
If I want to give something away (and I have given away a lot of bits of code over the years), then that’s my choice. If I want to charge, then that’s also my choice. This kind of back-biting is totally counter-productive to both the problem at hand and the Lisp community as a whole. There are some outstanding free implementations of Lisp. Are they as “good” as Franz’ Allegro Common Lisp? Probably not. But they also don’t cost money, and as my mother once told me: you don’t get what you don’t pay for.
Is ACL expensive? Yes. Is it worth it? Probably. Are the guys at Franz flexible and interested in working with you? Yes, totally. They, along with the gang at Cincom are very flexible in coming up with licensing terms that make sense for everyone at the table. They want to make money, but they also understand that they don’t make money if you aren’t making money. Having said that, I think that Franz should follow along with Cincom’s lead in offering a non-commercial release that is free and without any limits other than its non-commercial bits. Even if they can’t include all their doo-dads and gee-gaws (due to paying other people money for licensing), they could probably share most of it. If not, perhaps it makes sense to share with the community why they can’t do that.
Just a few thoughts. We don’t live in a world where anything is truly free of cost—everything has a price—even if you don’t bear the brunt of it.
No thoughts
The continued progress with DabbleDB marches forward. In approximately 5 minutes, I just sat down and sketched out the beginnings of a requirements database for work. This isn’t designed to be a trouble-ticketing system, or anything like Bugzilla (fortunately!), but instead just to capture some marketing requirements, use cases, etc., so that we can use it to feed the engineering requirements for developers.
Still impressed. Now we just need one to have internal to the company.
No thoughts
Earlier, I wrote about my initial experiences with DabbleDB. We’ve now been using it for a week or so, and the initial impressions have only been enhanced. One of the bits that I’ve now had a chance to use is importing data from outside sources—like the proverbial Excel spreadsheet in our case.
The first import didn’t go so well, mostly because I didn’t notice how the spreadsheet used hidden columns to space things out. Woops! After I corrected that, I ran into two issues:
- Columns with numbers formatted with commas caused problems in parsing
- Header fields show up as entries.
The first I reported to Avi and Andrew, and was fixed in less than 6 hours. The second is one that’s harder to decide how to fix without introducing complexity to the UI that is not necessarily going to serve the needs of others.
I think the best comment I’ve heard internally so far was from the person who normally had to maintain all this data. When she realized what she could do and how easily, she went looking for new things to stick in there and keep track of.
For me, the highest praise is that it’s boring. I mean that in the best possible way. It just works, and that means there’s nothing for me to play with! Damn you!
No thoughts
Yesterday, I managed to score a preview account to look at DabbleDB, which is being written by Avi Bryant and Andrew Catton. The goal is to capture the long tail that is represented by the millions of “mini applications” and “mini databases” that are stored in Excel spreadsheets around the world. To put it to the test, I decided to implement my company’s internal asset database, currently stored in an Excel spreadsheet.
5 minutes.
Read that again. 5 minutes. All I had done was watch the presentation at the Vancouver Lisp meeting (Quicktime), and that’s it. Didn’t read any documentation.
So, that under the belt, I decided to also add our evaluation systems into it, and track all the equipment that goes in and out of the organization for evaluation by customers.
5 minutes.
Seriously. Most of that was spent thinking about what was needed: target end dates compared to real end dates, tracking information, etc. A few clicks, presses, and boom. Not only does it allow you to associate hardware with an evaluation, it automatically links together multiple pieces of hardware on the same evaluation without my having to make the slightest effort to tie them together.
There are still a few missing pieces, obviously, but not only is the core concept excellently worked through, but the implementation is amazingly polished. Compared to some other tools I’ve used, it’s lightyears ahead in both usability and functionality.
Update: I found a minor bug in their Excel importing facility, and sent an email to Avi and Andrew. It was fixed in an hour or so and now works beautifully.
5 thoughts
Something like that.
I’ve been playing with Avi Bryant’s continuation-based web framework Seaside, which is written in Smalltalk. Wow. That’s all I can say. After some recent work with Rails, I had come to admire the cleanliness of the framework, even if, on occasion, I had some complaints about short-cuts taken that need not be necessary. Compared to Seaside, Rails seems to me to be a jalopy. Don’t get me wrong, it’s a seriously pimped out jalopy, but the ease with which one can build interactivity and modify it on the fly with Seaside is mind-blowing.
There is, however, a concern about integration with database systems, as there always seems to be when I have to deal with Smalltalk. I am looking at integrating through Alan Knight’s GLORP, which has some brilliant mapping ideas and largely allows for the disposal of mind-numbing shifts between Smalltalk and SQL. My concern, however, is that one of the data-sets I need to work with is currently housed in a MySQL database, and as far as I can determine, there’s no functional MySQL support and the one I had heard of is no longer available.
NB: Don’t take this as a slam of Rails, as it’s not. Rails is brilliant—for what it is. It takes the historical model of page interaction and data storage to new heights of simplicity. It doesn’t, however, change how you view the web. Seaside does. Whether you use it for your next project, or not, it’s worth looking at, going through the tutorials, and allowing your mind to conceive of a web that simply behaves more naturally.
4 thoughts
There is an odd and disquieting conversation going on on comp.lang.python about adding symbols to Python. Every other language I use regularly (Smalltalk, Ruby and Lisp) has the concept of a symbol in the language. The Wikipedia contains the following definition for a symbol:
A symbol, in its basic sense, is a conventional representation of a concept or quantity; i.e., an idea, object, concept, quality, etc. In more psychological and philosophical terms, all concepts are symbolic in nature, and representations for these concepts are simply token artifacts that are allegorical to (but do not directly codify) a symbolic meaning.
In Lisp, symbols are actually objects in the system, and to quote the hyperspec:
Symbols are used for their object identity to name various entities in Common Lisp, including (but not limited to) linguistic entities such as variables and functions.
In fact, because of the structure of macros in the Lisp world, you actually have ways to generate symbols that are held as placeholders for other symbols, in the form of GENSYM. For Smalltalk, the definition is a little different, but also it is a specific type of object (from Squeak):
Symbol is a subclass of String, and understands, in large part, the same messages. The primary difference between a symbol and a string is that all symbols comprising the same sequence of characters are the same instance. Two different string instances can both have the characters ‘test one two three’, but every symbol having the characters #’test one two three’ is the same instance. This “unique instance” property means that Symbols can be efficiently compared, because equality (=) is the same as identity (==).
And in Ruby:
Simply, a symbol is something that you use to represent names and strings. What this boils down to is a way to efficiently have descriptive names while saving the space one would use to generate a string for each naming instance.
The common theme that runs through all of these implementations is that symbols are really just placeholders. We don’t particularly care what they are placeholders for, only that we can make comparison decisions based on them, and the only comparison that matters is equality. No other manipulation really matters. (This is not totally true in Lisp, but without macros, the rest vaporizes).
In the Python world, a symbol is a name for something. For example, when you define a function:
def myFunction(x, y, z):
    pass
The name of the function, myFunction, is a symbol, as are the variables x, y and z. The idea exists; however, it’s never been formalized in the same way it has in other languages. Python even has a symbol module, but it’s really not the same thing.
Symbols can be thought of as a parallel namespace (or in the case of some languages, multiple name spaces, each attached to a package) with a whole set of strings in them which will always be the same. A veritable garden of global names. For example:
x = 'string'
x = :string
are two seemingly similar things in Ruby, but the second refers to a symbol that will always be the same. The reality is that the Python world has been using strings as symbols for a very long time, but without a lot of the advantages in implementation that the exposure of real symbols can bring (lots of reduced evaluation context costs).
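The closest thing Python offers today is string interning, which gives equal strings a single shared object. A minimal sketch using the standard `sys.intern` function follows; the variable names are just for illustration.

```python
import sys

# Build two equal strings at run time so each call produces its own object.
a = "".join(["next", "_", "state"])
b = "".join(["next", "_", "state"])
print(a == b)            # the contents are equal

# Interning maps equal strings onto one shared object, much like a symbol table.
sym_a = sys.intern(a)
sym_b = sys.intern(b)
print(sym_a is sym_b)    # identity comparison now works
```

Once both values have gone through `sys.intern`, an identity check stands in for the character-by-character comparison, which is the same "unique instance" property the Squeak comment describes. The difference is that it is opt-in and has no syntax of its own.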
1 thought
When people are introduced to Smalltalk, they often come away a bit befuddled, and also feeling like some of their syntactic sugar addictions can’t be fulfilled. Take the wily “case” statement that exists in many languages. Turns out you can implement it in Smalltalk via just simply extending the underlying classes. Poof! Try that in most languages and people will look at you funny. So far, Lisp and Smalltalk are the only two languages I’ve used that are so trivially extensible to meet new requirements. Others can be extended, but not so seamlessly.
Andy Bower, however, observes in the linked wiki page:
When I first came to Smalltalk from C++, I couldn’t understand how a supposedly fully fledged language didn’t support a switch/case construct. After all when I first moved up to “structured programming” from BASIC I thought that switch was one of the best things since sliced bread. However, because Smalltalk didn’t support a switch I had to look for, and understand, how to overcome this deficiency. The correct answer is, of course, to use polymorphism and to make the objects themselves dispatch to the correct piece of code. Then I realised that it wasn’t a “deficiency” at all but Smalltalk was forcing me into much finer grained OOP design than I had got(ten) used to in C++. If there had been a switch statement available it would have taken me a lot longer to learn this or, worse, I might still be programming C++/Java pseudo-object style in Smalltalk.
Something I had never really thought about, but it’s truly the difference. I am still unlearning habits that I’ve picked up in other languages in order to fully grasp the elegance of some things. As I’ve said many times, simplicity is the path to power.
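The point about letting objects dispatch for themselves carries over to other languages too. Here is a small Python sketch of the idea; the `Circle` and `Rectangle` classes and the `total_area` function are invented for the example.

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

def total_area(shapes):
    # No switch on the kind of shape: each object picks its own code.
    return sum(shape.area() for shape in shapes)
```

Adding a new kind of shape means adding a class with its own `area` method. `total_area` never has to change, which is exactly what a central switch statement cannot promise.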
[via Lambda the Ultimate]
No thoughts
In discussing the next generation UI framework for VisualWorks, the author commented that there were “over 20,000 [unit tests] currently, and they take about 10-12 minutes to run.” That’s a lot of unit tests. That’s probably more than I’ve ever written in my life, and I’ve written a lot.
I just can’t imagine. This is a testament to the author, as well as the environment that makes it trivial to keep adding tests incrementally and keep on top of them.
No thoughts
John Gruber (one of the great bloggers, in my opinion) writes about the issues with Applescript, concluding it’s an abysmal failure:
But saying what you mean, in English, almost never “just works” and compiles successfully as AppleScript, and so to be productive you still have to understand all of the ways that AppleScript actually works. But this is difficult, because the language syntax is optimized for English-likeness, rather than being optimized for making it clear just what the fuck is actually going on.
While I think he’s a bit harsh, and Applescript has been used in a lot of cases to do things that are impossible on most machines, I do think that its attempt to model English, rather than abstract spoken language, makes people expect more flexibility than is really natural in a computer-parsable language. English has massive amounts of context that is both implicit and explicit, all of which is nearly impossible to duplicate in a computer language.
I think languages like Smalltalk, with a simple, regular syntax, are better suited to this sort of thing. There’s even F-Script on the Mac, which exposes all of the Applescripty goodness for automation.
The goal, rather than trying to make the user’s language “natural” (something that is still decades away; thanks, Marvin), is to make it regular, predictable and comprehensible. In addition, it must be discoverable. All of these things are attributes of Smalltalk, and to some lesser extent (in the discovery domain) Lisp.
Power, and learning, come from simplicity, not complexity.
1 thought
Avi Bryant, he of the brilliant Seaside framework, writes about interfacing objects with FTP. Brilliant stuff, but it’s not actually “new.” In fact, when I worked on Zope, we implemented an FTP interface into the object-database, which was amazingly useful. There are some issues related to it, but they’re all solvable.
5 thoughts
Over the past year or so, as I’ve gotten my teeth back into Smalltalk and Lisp I’ve had the excellent opportunity to work with some of the technical (and product) people at both Cincom and Franz. My experience has been beyond excellent.
I have had excellent discussions with people about minute details of implementation, and they’ve provided feedback that has been excellent at moving ideas along. Offers of help with proof-of-concept work, insight into product roadmaps, and even honest discussions of where things don’t work have been the hallmark of both organizations. This is a stark contrast with most companies.
All I can say is both companies have some of the most brilliant people I’ve ever had the opportunity to talk to, and their insight is illuminating. It’s nice to see there are a few places where true technical excellence is still critical. Helping your customers be successful will always help you be successful.
No thoughts
So I’m working on some various parsing doo-dads, mostly in Smalltalk because it’s what I like most in life. One of the things that I’ve gotten used to in Python, Perl, Ruby, whatever, is the ability to quickly convert a delimited string (let’s say with my favorite delimiter, the pipe, |) into an array of some form. For Pythoneers, it would be myString.split('|'). Very simple, and nearly identical in most other languages.
Smalltalk doesn’t have this obvious functionality, but streams do have upTo:, which will return everything up to a specific character. More importantly, they have upToAndSkipThroughAll:, which returns the bit up to a specific delimiter, and then skips over that delimiter. Unfortunately, I hate how that’s spelled, and I hate having to loop over it.
So what to do? Extend the class model, of course! Ta-da!
splitOn: aCharacter expectedCount: anInteger
| aCollection |
aCollection := OrderedCollection new: anInteger.
[self atEnd]
whileFalse: [aCollection add:
(self upToAndSkipThroughAll: aCharacter)].
^aCollection.
Now, that’s not quite splitOn:, is it? Well, after doing some minor benchmarking with my application, I discovered that I was needlessly resizing the OrderedCollection. I happen to know how big it’s going to be, since it’s a regularly formatted line. So I can pass that in. If I don’t want to pass it in, I can just do this:
splitOn: aCharacter
"Answer back an OrderedCollection of the various tokens."
^self splitOn: aCharacter expectedCount: 25.
This is a common idiom in Smalltalk. Because all selectors must be fully qualified out, you can’t have “default parameters” as you might in Python. The idiom is to implement other selectors as chains to the most detailed one. This actually is, in many ways, more cleanly specified than Python, but it’s a bit more typing sometimes. Fortunately, I save so much elsewhere that it’s not something I sweat.
Now, I can just type:
tokens := aString readStream splitOn: '|'.
Voilà! I’ve now gained this on any Stream in the system. Try changing a core class in your ossified language of choice.
Nota bene: Ruby can do this as well, if not nearly as dynamically.
Ack, as someone on #smalltalk pointed out, this is actually already in SequenceableCollection as tokensBasedOn:. I swear I searched, since I thought it was there, but I guess it was late at night and I was dead. Oh well, still easy to add.
1 thought
Ruby is heavily inspired by Smalltalk. I’d say it’s about 50% Smalltalk-inspired, and 50% Perl-inspired. Obviously, one can detect what my preference is in those inspirations. Since I’m talking about performance, a little anyway, I’d like to compare a bit.
Here’s the situation. I have a file, with 22,291 “micro-content” XML bits. Each root “document” has 29 elements that are interesting. All I am interested in is the textual content of these elements. The code for Smalltalk is:
firewallStream := 'ns_fw.xml' asFilename readStream.
[firewallStream atEnd] whileFalse:
[xml := parser parse: (firewallStream upTo: Character cr) readStream.
root := xml root.
newEvent := NormalizedEvent new.
newEvent xmlConversionMap keysAndValuesDo:
[:elementName :selector |
newEvent perform: selector
with: (root elementNamed: elementName) characterData]].
The code for Ruby:
File.open('ns_fw.xml', 'r') do |file|
  file.each_line {|line|
    tmp = NormalizedEvent.new.from_xml(line)
  }
end
This uses this code, inside the class:
def from_xml(xml_string)
  xml = REXML::Document.new(xml_string)
  root = xml.root
  XML_EVENT_MAP.each {|xml_form, event_form|
    tmp = root.elements[xml_form].text
    self.send(event_form, tmp)
  }
end
The big difference is that in the Ruby code I used a method inside the object to populate the instance attributes, as I honestly had no idea how to do this dynamically in a clean way. I’m new to Ruby, so this is likely not the optimal way to do it. Honestly, they both do the same basic thing; I just moved the code inside the class for Ruby, which should make it faster.
Time? 0.89ms/iteration for Smalltalk, 24.7ms for Ruby. Both are using “native” XML parsers, not wrappers for some other library. For Ruby, this is REXML. Definitely a difference in maturity for the VMs in both places. This is not to say Ruby is bad, it’s probably roughly on par with Python and Perl, but it’s not quite in the league of Smalltalk (or Lisp, likely), which have extremely mature VMs with on-the-fly compilation and optimization.
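For anyone wanting to reproduce this kind of number in another language, a per-iteration figure can be measured along these lines. This is a generic Python sketch, not the harness used for the figures above; the helper name `ms_per_iteration` is hypothetical.

```python
import time

def ms_per_iteration(func, items):
    """Average wall-clock milliseconds spent in func per item (hypothetical helper)."""
    start = time.perf_counter()
    count = 0
    for item in items:
        func(item)
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        raise ValueError("need at least one item to time")
    return elapsed * 1000.0 / count
```

Timing the whole run and dividing by the count, rather than timing each call separately, keeps the clock overhead out of a measurement that is well under a millisecond.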
Remember, Java’s JVM (HotSpot) came from Strongtalk, an explicitly-typed version of Smalltalk.
And for those who have read my red herring of performance comments, and wondered if I’m a hypocrite, the answer is yes, but not in this case. I happen to know from profiling that this component is used tens of millions of times, representing a large percentage of the time consumed, and worse, in a real-time fashion, so performance matters in this case.
Sadly, this is not something I can fix by saying “go away, damn XML atrocity,” because it’s something dictated by entities outside my influence.
7 thoughts
In working on some Smalltalk code (in VisualWorks) on my Mac, things seemed to be running slower than I thought they should. What I was doing was taking a stream of data that represented a bunch of XML “micro-content” and parsing it, then turning it into a new object:
xml := parser parse: (inputStream upTo: Character cr) readStream.
root := xml root.
newData := NormalizedData new.
newData xmlConversionMap keysAndValuesDo:
[:elementName :selector |
newData perform: selector
with: (root elementNamed: elementName) characterData].
What that does, for those unfamiliar with Smalltalk code, is read a bit of a stream, up to the end of the line, then turn it into another stream and hand that to the parser. That’s boring. The interesting part is the keysAndValuesDo: selector. It runs through a mapping of XML element names to accessors in the new object, and dynamically calls them with the provided data. Perhaps not the fastest, but for me, the most obvious and cleanest way to do this.
On my 1.8GHz G5 iMac (2GB RAM), it takes 2.4ms to convert one iteration. On an Athlon64/3000 box, which is 2GHz and 1GB RAM, it takes 0.9ms. While some of this difference is in actual processor differences, most of it, I suspect, is in the JIT compilation inside the VM. Obviously, the x86 platform is doing a much better job. One hopes that with Apple’s migration to Intel, this will become a non-issue.
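For those who read Python more easily than Smalltalk, the same mapping-driven approach might look something like the sketch below. The class layout, the element names `src` and `dst`, and the setter names are all invented for illustration; `getattr` plays the role of perform:with:.

```python
import xml.etree.ElementTree as ET

class NormalizedData:
    # Hypothetical mapping: XML element name -> name of the setter to call.
    XML_CONVERSION_MAP = {"src": "set_source", "dst": "set_destination"}

    def __init__(self):
        self.source = None
        self.destination = None

    def set_source(self, text):
        self.source = text

    def set_destination(self, text):
        self.destination = text

def from_xml_line(line):
    """Parse one line of XML micro-content into a NormalizedData."""
    root = ET.fromstring(line)
    data = NormalizedData()
    for element_name, setter_name in NormalizedData.XML_CONVERSION_MAP.items():
        # Look the method up by name and call it with the element's text.
        getattr(data, setter_name)(root.findtext(element_name))
    return data
```

The mapping is the only place that knows which element feeds which accessor, so supporting a new field means adding one entry and one setter rather than touching the parsing loop.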
One nice thing, though. To do the comparison, I simply moved my image from one machine to the other, and was therefore able to truly do a comparison.
No thoughts
From Techworld:
Mahindra British Telecom (MBT) said on Thursday it is putting the finishing touches on software that almost completely automates the process of converting legacy applications written in languages such as Cobol, Pascal, Delphi and Smalltalk to modern languages such as C, C++ and Java.
Since when are C and C++ (or Java) “modern languages”? C dates back to BCPL and the late 60s. It’s hardly modern. It’s really a glorified macro-assembler in its real form. C++ is simply a macro-assembler obfuscated to a level unfathomable by humans. Java just makes it slower.
Conversion can currently require armies of programmers and many man-years of time; MBT claims its automated system reduces the human factor by 90 percent, and ensures “zero-error” quality. All this is possible because any programming language can be abstracted to a few fundamental principles, MBT said.
Mmmm, so let’s see how you abstract my meta-model, like the one in use in Kapital at JP Morgan. The reason it can be abstracted to such drivel is that most software is crap. It doesn’t exploit the capabilities of the language, or the models it can create. I think it would be hard to convert a well-architected LISP application to Smalltalk, much less to C++ or Java. That most software is simplistic enough to convert says more about the poor state of software architecture than it does about what is possible.
Maybe you can convert 90% of the code. Perhaps. But is it maintainable? Or is it some horrid thing like what the old C++ preprocessors used to spit out, unreadable and unmaintainable? Just converting my reasonable COBOL code to unmaintainable C++ isn’t much of a service.
1 thought
From a blog entry about closures and blocks in Java:
I freely admit that I have never written a single line of Smalltalk code. In spite of that, I do understand some of the benefits Smalltalk has over mere mortal languages such as our beloved Java.
Indeed. The thing is, I find real human beings, as opposed to the automatons that pass for most “developers” (code monkeys is more accurate), have a penchant for working with things like this. One of the things that attracts me to Ruby over Python these days (although I admit I’ve not made the switch, largely because of documentation and libraries) is its inclusion of blocks. Blocks just make sense.
The problem with the solution of putting it in a “library” is that a closure or block is truly a core component of languages, not some add on. It will never work right or cleanly, or quickly, if you bolt it on like some kind of syntactic curb-feeler.
1 thought
James Robertson talks about threading in Smalltalk, which was spawned by some discussion on the mailing list about threading, and more importantly taking advantage of multiple processors. For me, there are two things being discussed:
- Threading as a programming model
- Taking advantage of SMP/processor threading
James really only addresses the first concern (how to use threads for concurrent programming), which is a major issue, and is best managed through green threads in my opinion. Spawning threads is also useful for dealing with I/O in a more easily managed asynchronous manner.
Unfortunately, that doesn’t address the ability to utilize multiple CPUs, or CPUs that have multiple threads internally (P4, POWER5). That requires an OS-level thread or process. What would be useful for me is a very lightweight way to send messages in Smalltalk between processes on the same machine, perhaps using shared memory or named pipes, allowing high bandwidth and low latency.
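To make the "lightweight messages over a pipe" idea concrete, here is a minimal Python sketch of length-prefixed framing over a file descriptor. It is not anything VisualWorks provides. The function names and the 4-byte header are my own assumptions, and it only covers framing, not serializing objects.

```python
import os
import struct

# Assumed wire format: 4-byte big-endian length, then the payload bytes.
HEADER = struct.Struct("!I")


def send_message(fd, payload):
    """Write one length-prefixed message to a file descriptor."""
    data = HEADER.pack(len(payload)) + payload
    while data:
        written = os.write(fd, data)
        data = data[written:]


def _read_exactly(fd, count):
    """Read exactly count bytes, or raise EOFError if the pipe closes early."""
    chunks = []
    while count > 0:
        chunk = os.read(fd, count)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(fd):
    """Read one length-prefixed message from a file descriptor."""
    (length,) = HEADER.unpack(_read_exactly(fd, HEADER.size))
    return _read_exactly(fd, length)
```

The descriptors could come from os.pipe() shared with a child process, or from a named pipe created with os.mkfifo(), with each process holding one end.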
No thoughts
and smoke it… Honestly, I’m not sure how serious he’s being though… it’s late.
We all know that most of the memory in a Smalltalk image is taken up by Instances of objects. A large saving can be made if we stop creating new instances of classes. This has other advantages too. It makes Smalltalk more like Self.
Michael Lucas-Smith then goes on to explain how to do exactly that and why one might want to. Try doing that one with Java, or C++, or any other “object oriented language.” It’s not that you might want to flip the whole thing on its head and make it a prototype-based language, like Self, it’s that you could.
Of late, I’ve been back mucking with Python code, for various “political” reasons internal to my company (Smalltalk is just too scary to some people), and I miss some of the bizarre elegance of Smalltalk, and definitely miss the browser and re-factoring skills. WingIDE is nice, but it ain’t VisualWorks.
Update: Turns out Mr. Lucas-Smith wasn’t serious, but the fact that you could do such insane things was amusing.
No thoughts
Travis Griggs talks about memory management and performance, and this reminded me of something we ran into at work. A specific component of our system is written in C++ and is very performance focused. One of our customers ended up pushing it harder than we’d ever been able to in the labs and it kept pegging the CPU for no known reason. Eventually, we tracked it down to a memory management issue.
The software was originally written with a basic C-style malloc structure in it. This worked fine up to about 80% utilization, but started breaking down because we were allocating and freeing memory so fast that it was thrashing the VM system in the OS. By changing strategies to a block allocator, we were able to seriously increase performance and stability.
This reminds me that we shouldn’t worry about optimization until we have numbers to tell us where to look. The bottleneck is rarely where you think it is, and sometimes it’s deep in the intricacies of your code’s interaction with the rest of the world.
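I can’t show the real C++ code, so here is a Python sketch of what a block allocator means in its simplest form: a fixed-size pool that makes one allocation up front and recycles blocks through a free list. The class name and interface are invented for illustration, and in Python this shows the structure only, not the performance win.

```python
class BlockPool:
    """Fixed-size block pool: one big allocation, recycled via a free list."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        # A single up-front allocation backs every block.
        self._arena = bytearray(block_size * block_count)
        # Free list of block indices; push and pop are both O(1).
        self._free = list(range(block_count - 1, -1, -1))

    def allocate(self):
        """Return (index, view) for a free block, or raise if exhausted."""
        if not self._free:
            raise MemoryError("pool exhausted")
        index = self._free.pop()
        start = index * self.block_size
        return index, memoryview(self._arena)[start:start + self.block_size]

    def release(self, index):
        """Return a block to the pool so a later allocate() can reuse it."""
        self._free.append(index)
```

The point is that allocate and release never go back to the operating system, so a burst of allocations and frees stays inside memory the process already owns.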
1 thought
I was inverting the concepts in some basic code I wrote (a file watcher, which basically calls code whenever a line is appended). It involves spawning processes so that the IO can happen asynchronously, as there are a lot of delays and idle time. Anyway, I messed up the logic of one piece, swapping less-than and greater-than. Simple mistake, but it threw the image into a “dead lock” state since the process wouldn’t relinquish the CPU. I hit Control-Y, which interrupts the front process, diagnosed the problem, unwound the stack, fixed the code, hit “go” and let it run correctly.
No need to get back to where I was, just let the system unwind and keep going. Between that and being able to highlight some code I used in a couple of places, right-click, select “Extract to method” and have all the code fixed up automatically, I’m slowly getting into the swing of everything being a method, refactoring down and having methods that are on average just a couple of lines of code.
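For the curious, the core of a file watcher like the one mentioned above can be sketched in a few lines of Python. This is not my Smalltalk code. The function name and the offset-passing interface are assumptions made for the example.

```python
def read_appended_lines(path, offset, callback):
    """Call callback for each complete line added after offset.

    Returns the new offset to pass in on the next poll.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        while True:
            line = handle.readline()
            # Stop at EOF, or at a partial line that is still being written.
            if not line.endswith(b"\n"):
                break
            offset += len(line)
            callback(line.rstrip(b"\n").decode("utf-8", "replace"))
    return offset
```

A polling loop would call this repeatedly and sleep between calls. Forgetting to yield between polls is exactly the kind of mistake that keeps a process from ever relinquishing the CPU.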
No thoughts
So, talking on the IRC channel about “bloat” since Smalltalk was created, we thought about the fact that Eclipse is as “close” as you can get to a Smalltalk environment, so how big is Eclipse compared to VisualWorks? So let’s separate the two as much as possible:
- VisualWorks Object Engine (i.e., VM): 690K
- VisualWorks “image”: 18Mb (including all development tools)
Please note that that includes tons of code, the entire source of the system, refactoring browser, implementations of crypto, the GUI, etc. Now for Eclipse:
- Sun JDK 1.4.2 for Linux: 65Mb (just runtime)
- Eclipse 2.1: 80Mb
So, let’s see… VisualWorks is about 20Mb including everything you need, once it’s been loaded from a Store repository or parcels. Eclipse is about 145Mb (65Mb + 80Mb), or about 7x larger. Sad.
Yes, I know I’m not comparing apples to apples, but honestly, VisualWorks has things Eclipse can only dream of, and I’m sure there are things included in the Java world that I’m not counting against VisualWorks, but let’s be real…
4 thoughts
I’ve updated the version in the public repository to 0.2, and performed the following things:
- Renamed RemoteUDPSyslogServer to UDPSyslogServer to better reflect its use
- Refactored functionality out of initial implementation to AbstractSyslogServer
- Implemented a UnixDomainSyslogServer that is designed to take a UDSocketAddress
The total refactoring took about oh, 2 minutes. Unfortunately, I tripped on what I think is a Refactoring Browser bug… when I pushed up some class variables, the initialization code disappeared. So boom, I decided to rewrite them as lazy initialization code instead.
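Lazy initialization here just means the shared state is built the first time something asks for it, instead of in a separate class-initialization step that a refactoring can lose. A Python sketch of the pattern follows. The class body and the facilities table are hypothetical, though the three facility codes shown are the standard syslog values.

```python
class AbstractSyslogServer:
    """Hypothetical Python analogue; the contents are illustrative only."""

    _facilities = None  # class-level state, filled in on first use

    @classmethod
    def facilities(cls):
        # Build the table the first time it is requested, so nothing
        # depends on a separate initialization step having been run.
        if cls._facilities is None:
            cls._facilities = {"kern": 0, "user": 1, "local0": 16}
        return cls._facilities
```

Every caller goes through facilities() rather than touching the variable directly, which is what makes the missing initializer harmless.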
To use the new one, use code like this:
localSyslogServer := UnixDomainSyslogServer
newAt: (UDSocketAddress name: '/dev/log')
for: #local0.
It’ll fit on one line, no big deal, but I don’t want to screw up wrapping on this page. After that, it’s the standard interface from before.
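Under the hood, talking to the local daemon is just a datagram written to a Unix domain socket. Here is a minimal Python sketch of that transport. The function name is made up, and it assumes the daemon listens on a datagram socket (common on Linux, but some systems use a stream socket for /dev/log).

```python
import socket


def send_local_syslog(path, packet):
    """Send one already-formatted syslog packet to a Unix domain datagram socket."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(packet, path)
    finally:
        sock.close()
```

For example, send_local_syslog("/dev/log", b"<131>myimage: Error message here") would log at local0 with error severity, since 16 * 8 + 3 = 131.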
For those parsing the syslog files, you will notice that most syslogd daemons treat local messages differently than remote messages. For example, I tell syslogd not to resolve IP addresses, so the messages will have ‘192.168.1.1’ in them if they came over UDP, but local messages will always have the host name of the server, regardless of what options you pass it.
No thoughts
I just published my first “public” package, called OSLogging, to the Cincom Public Repository. It is just a tiny class I wrote to send correctly formatted UDP packets to a syslog daemon. It currently stands at version 0.1. Why so low?
- No unit testing has been written, as I’m not sure how to unit test network code in this specific case, as you don’t get a response back.
- No testing has been done on any other platform
- The file writing logger is not done yet.
The basic API is very simple. First you’ll need to create an instance to work with:
syslogServer := RemoteUDPSyslogServer
newAt: (IPSocketAddress hostName: 'hydra.amber.org' port: 514)
for: #local0.
You’ll note that I’m using symbols for the various facilities, and for the severities, though the latter are largely hidden. You can also just use newAt: and it will assume local0. Then, to send messages, you just do this:
syslogServer error: 'Error message here'.
That’s it. The various levels are mapped to different messages. By default, the tag that is used (i.e. the process name) is the name of your image. One final security note: The code will truncate all messages to fit in 1024 bytes, as the RFC requires. It does this silently, currently.
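For anyone who hasn’t looked at the wire format, here is a Python sketch of how a BSD-style (RFC 3164) syslog packet is put together, including the priority calculation and the 1024-byte truncation. It is not the OSLogging code. The function name, the parameter names and the "localhost" default are assumptions, and the facility table is deliberately partial.

```python
import time

FACILITIES = {"kern": 0, "user": 1, "daemon": 3, "local0": 16}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "error": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
MAX_PACKET = 1024  # RFC 3164 upper bound on the whole packet


def format_packet(facility, severity, tag, message,
                  hostname="localhost", now=None):
    """Build one RFC 3164 style packet, truncated to MAX_PACKET bytes."""
    # The priority value packs facility and severity into one number.
    pri = FACILITIES[facility] * 8 + SEVERITIES[severity]
    t = time.localtime(now)
    # The day of month is space-padded, not zero-padded.
    stamp = "%s %2d %02d:%02d:%02d" % (
        MONTHS[t.tm_mon - 1], t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)
    packet = "<%d>%s %s %s: %s" % (pri, stamp, hostname, tag, message)
    # Silent truncation at the byte level; this can split a multi-byte
    # character if the message is not plain ASCII.
    return packet.encode("utf-8")[:MAX_PACKET]
```

The resulting bytes can be handed to a UDP socket’s sendto() aimed at port 514 on the log host.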
The goals for version 1.0 are:
- Implement unit testing in some useful fashion (useless tests are just that)
- Implement TCP syslog protocol if possible, but need to research whether it makes sense in non-relay situations
- Finish the file-level logging with rotation as necessary
Please feel free to make all the requests in the world, or suggestions, or code critiques. This is my first real public Smalltalk code, and I’d appreciate any style suggestions.
No thoughts