The Avatar ([info]saurik) wrote,
@ 2004-04-02 16:42:00
Previous Entry  Add to memories!  Tell a Friend!  Next Entry
C++, C--, and CCS
(Local time: 02:42.) Sometime in the middle of tomorrow I'm going to go through here and edit this to add the names of the talks as I managed to leave my schedule at Murat's hotel so I don't remember exactly what everything was today. :(

I ended up sleeping in a little this morning and then spent an hour working on the Internet rather than heading out for talks, so I ended up showing up almost at the end of the second actual talk today (not counting the invited lecture), "?elimating partial redundant flow graphs?". It had something to do with finding redundant code sections that weren't lexically or even instructionaly identical but still happened to evaluate to the same values. They compared their technique to something called Global Value Numbering, which got mentioned by someone else later in the day, but I unfortunately can't say much about as I'm not sure what it is.

The next talk I saw was on "?scalar replacement?". It was done by a rather well spoken graduate student and covered taking rather complex looping structures and removing all possible redundant loads from them. This technique works not only for things that can be brought out of a loop, but values that are used, for example, during two consecutive loop runs. Imagine an array implementation of fibonacci: each loop reads two elements and writes one. If you burned two extra registers on it, you could half one register for the last two entries in the array. Then you add the second onto the first, store that value to memory, and swap the registers. Repeaing this would perform the entire sequence with no more loads.

Taken to the extreme, however, this can burn many, many registers. He was showing it's usage mainly for FPGAs or other architectures were you might have 64 registers available for doing work. Then you might take multiply nested loop structures and a set of 30-some registers and keep rotating the registers in a loop feeding new data in on one end. He had quite good numbers, but people were questioning how he measured the number of registers that were used by the various algorithms tested. I can't say that I entirely understood or even remember what little I did, what the issues were that people were complaining about.

After this was a talk on optimization of value-type genericity. In many generics implementations, rather than doing multiple expansions a la C++, they will simply copy all the value-types to the heap and then have a single instantiation that really is internally working off of Object and doing casts along the border interface. (Java 1.5 is an example of this scheme.) The speaker (a graduate student) is just finishing his thesis work on writing a small object-oriented language that had generics that worked in this fashion but did lazy boxing in order to gain a performance advantage when the boxing operation wasn't actually neccessary (such as you declared some value types on the stack and then called some generic object implementation to do some work on them before leaving your stack frame, where the algorithm died before the objects did).

He managed to get some interesting speedup numbers with a single case of slowdown that was explainable to something he simply hadn't gotten around to implementing in the algorithm (i.e., it was fixable). It worked by allowing stack pointers to data earlier on the stack without doing any copying (as that is a safe operation as the pointer will unwind before the data does) and only doing box operations when the pointer is stored into the field of some heap object. I had a complaint involving his algorithm causing multiple copies of the object to exist behind the backs of people operating on it, but I got a chance to talk with him a lot afterwards about the implementation of value-types in various languages (he, being a graduate student, ended up being a nice person to talk to), and he was making an assumption (that was in his paper but that he had forgotten to mention in the talk) that value types would be immutable that I wasn't (my coming from a background in C++ and low-level .NET, where value types are definitely mutable).

It turns out the main developer of C--, Norman Ramsey, is at this conference and seemed rather interested in what garbage collector this language used and had a few issues that were brought up during question and answer period and then that he followed up with after the talk was over. In general I got a rather negative impression of Norman. When I asked my question involving the semantics of the object copy operations he was making this kind of disgruntled shaking of his head motion (he evidently was also in the "immutable value type camp") and then he kept being rather actively hostile towards me afterwards while we were all talking to the speaker. Example: he made a comment about usage statistics tracking in Boehm's garbage collector and how it might have it, so I commented that Boehm actually did just implement that about a month and a half ago for the gcj project and Norman gave me an evil look and just ignored the comment. He spent most of the time I saw him going on about how wonderful C-- is and how everyone should be using it for all their compiler backends... it was just irritating how he kept coopting every discussion into that.

I also got to talking with another graduate student (one who was also talking with the same speaker) afterwards and walked with him to the evil sandwich boxes (which I didn't even bother to get today). He asked what I was working on and why I was there, so I gave him the entire explanation of what jMonitor was and how it fit into everything and how it was different than just aspect-oriented programming (as he asked about that and I consider it a general concern of the work). Once again, graduate student == nice. The pattern just goes on like this for quite a while. After he got his sandwich box I split off and went to eat lunch at the same little on-campus restaurant I did yesterday and got the same meal.

After lunch I went to the SPIN conference for a while to listen to a few presentations of SPIN-related tools. I really didn't get much about of this as I'm not nearly enough into model checking for it, but it was at least fun to watch the first guy talk (the second two were rather boring to me, and one of them completely seemed to not understand memory maps even though they really should have been considered crucial to the heavily file-as-array I/O that he was doing). Turns out one of his unexplanables in his graphs probably got explained by one of the people in the audience who had seen similar kinds of results before in things that he had done. The talk was on a parallel model checker where states of the program were checked simultaneously on multiple computers, but there was a weird distribution of work loads that would cause most of the CPUs to finish long before the last few did, thereby causing the time to be drastically inflated. The explanation was that there was a problem with the hashing algorithm causing too much locking between multiple computers due to some particular quirk in the particular algorithm that it seemed only model checking people probably used, so I can't explain it, hehe.

The craziest part of that explanation, though, was the beginning of the talk describing how the tool was "extensible". I actually couldn't tell if it was a joke or not. The gist was that it was all written in C++ and if you wanted to extend anything you A) should think about it first as it might be a waste of time, think for a few hours to save weeks worth of coding, B) determined what code you needed to modify, C) derived a class from the class that contained that code, D) added the virtual keyword as required to make the methods you needed to change overridable, and E) override the methods with your new implementation. So in other words: you retrofit the codebase as you would any other C++ program to use inheritence. No one laughed, though, and he did it with a perfectly straight face, so I don't think it was a joke... oh well.

When the tool presentations were over I ran back to the Compiler Construction conference and caught the end of a talk called "" on what sounded like (from the very end of the talk and the _long_ q/a session afterwards) a CPU architecture that used much less CPU power but sometimes caused errors and didn't check important conditions. I wasn't sure if it that was because it would actually run out of power if too many wires were active or what, but he got reamed by the audience. Some of the complaints weren't all that valid (and actually caused small arguments within the audience), but a killer one was the author didn't understand what random number distributions were. All his distributions were uniform, which when interrogated he said was fine as that's why people used random number generators for these kinds of tests. I didn't see the talk, but I think the issue might have been that the architecture would rely on not needing too much power for too long a time, and if you got unlucky and kept getting rather large numbers you'd get very screwed very fast.

The talk after this was by someone from Cambridge. The talk was done with good old fashioned projector slide technology. The work was covered by a grant from ARM, but I guess they didn't have enough money for a laptop. This guy seemed interesting but he was pretty much reading his slides which got very boring very fast. His work was on an optimization for the "ARM Thumb" CPUs (a compressed instruction set version of ARM) to better utilize a specific instruction that lets you do multiple load/store operations from consecutive memory operations at once into monotonically increasing registers. (So like load the memory from 4h to 12h into registers 4, 6, and 7.) He managed to beat gcc by a bit on code size (which is the main reason people use the Thumb and therefor the most important consideration) and usually beat the ARM commercial compiler. However what was weird was that his optimization seemed to only save him a few instructions. I hadn't even noticed that when he went through his talk but someone in the audience did and asked him about it. I think the audience concensus (he disagrees) was that he just got lucky with his register allocator algorithm and chose one much better than gcc or ARM and therefor got good instruction counts regardless of this optimization.

During the coffee break I headed back to the Vertex building in order to pick up a copy of a book I had bought a few days earlier at the Springer publishing stand. I ended up getting into a rather long conversation with the guy working there, an editor for Springer named Ronan Nugent, about education and programs like CCS. He was quite interested in it and felt it was a program that fostering creativity was incredibly important, and the kind of atmosphere I was describing at CCS would really only end up happening in America due to silly social aspects in academia in Europe.

Another person I spent time talking to about CCS was Mark last night. He apparently went to a liberal arts college in Washington named Evergreen that had a somewhat similar system involving students doing a lot of independant study work and being able to build their own curiculumms. I find the story of CCS tends to interest a lot of people. :)

I headed back to CC then to find out I had spent over and hour talking with Ronan and had missed the next talk and most of the last talk. I went in and sat next to Mark, who was falling asleep. The presentation was on using Lua to due dynamic stack frame allocation, but as I missed most of I don't really understand what the usage cases are. It's technology that has been integrated into C--, however, I do know.

When that talk was over and the CC conference officially concluded I headed back to the computer rooms until they closed and the last few of us got kicked out. One of the other couple people was the person whom I thought had worked on C-Front from the day before (who had been talking with Scott). I took the opportunity to start a conversation with him. It turns out I had misheard that conversation snippet and he was working on the gcc C++ compiler. I had a _lot_ to talk to him about and we got along really well. His name was Gabriel DOS REIS (his capitalization, not mine) and he's been doing some papers with Bjarne Stroustroup about the future of C++ and where it's going. I brought up some of my really weird issues I had with using templates in hairy situations and we traded some examples of intersting C++isms back and forth.

Now, you might way to jump and say "Hah! A professor you get along with!", but my response will be "He isn't a CS professor, though; his work is in differential geometry. So there.". So now I only believe that _CS_ professors who attend this conference are generally snooty and mean. :) It turns out he got into C++ because no one had a good implementation of std::valarray a while back (a super, super fast templated array type designed for usage in numerical computing and largely added to the standard as a response to Fortran; an addition that largely got _ignored_ later because all the numerical computation people on the standards committee stopped contributing and never finished some key standards work that needs to go into it before it is actually useful) so he implemented it and gave it to gcc. While doing that he kept trying to use partial template specialization for various reasons and kept breaking gcc so he then moved to improving the compiler.

He asked for my e-mail address and I got his, so I hope we will be talking occasionally. When I get back to SB I'm probably going to go through and write up a list of all the issues I currently have with C++ that I haven't seen addressed by enhancement proposals yet and see what he thinks about them.

One thing I've been really irritated with is a recent issue I've had (and spent some time describing to Colin) involving the inability to pass template templates through standard templates without coding up a combinitorial explosion of different implementations for different number of template template arguments. It turns out that the new usage of the "auto" keyword I had already read about is, in even newer proposals (specifically that he and Bjarne had done), capable of solving this problem. When I get back I'm going to be reading over those papers. I had actually seen them but didn't get around to spending time looking at the details of them.

After Gabriel and I parted ways I called Murat again (he hadn't been in his hotel room earlier) and arranged to meet him there to do some preparatory work for our presentation tomorrow. I got on the subway and met him. Talking to a familiar person in English was _so_ nice. :) I can't wait to get home, hehe. I talked with him up until when I absolutely needed to leave for the SPIN dinner I had registered for, and then got directions to that from the hotel desk downstairs. I decided it was far enough away that I got a taxi to go there.

The taxi ride was the most enjoyable I've had. The driver and I spent most of the time talking to each other. I was talking to him in broken Spanish, sometimes furiously looking things up in my English to... Catalan (not Spanish I noticed)... dictionary, and he was speaking to me in broken English. He is 35 years old and his name is Gabriel. :) Our conversation was largely us telling each other how horrible our learn-a-foriegn-language experiences had been. For me it was my whole story of spending 4 years on French and switching to Spanish. For him it was the corriculum switching out from under him, changing from French to English. ;)

Ok, I'm going to cut it here before I describe the dinner as I'm really tired and I want to go to bed. (Yes, I realize I'm falling behind, but tomorrow is the end of the busy part of the conference so I'll be able to catch up in time for the end of the conference the day after, hehe.)


Create an Account
Forgot your login or password?
Login w/ OpenID
English • Español • Deutsch • Русский…