Tuesday, August 29, 2006

Darcs is broken



Darcs is broken. I found it out the hard way yesterday, while I was trying to commit my patches to the main trunk of GHC.
Essentially, my patches were living in a separate branch. Of course the Darcs model does not need the concept of branches, as traditional SCMs do, but the analogy works fine here.
The debugger is complete just in time for the GHC 6.6 release, and we were planning to include it in the release candidate yesterday, Monday.

But Darcs planned otherwise. While my patches were on a branch, I had been careful to keep this branch sync'ed to the main trunk. This means pulling patches from the main trunk once a week or so and fixing any conflicts with my code. Since my patches are spread around several subsystems in the compiler, conflicts have arisen three or four times during these three months.
It turns out that since my last sync with the main trunk, around the 22nd of August, a new patch has been committed which conflicts with my work. But this time Darcs has decided against simply doing the merge and flagging a conflict. Instead, the merge blows up exponentially and hangs.

I have tried everything! I switched to Linux, to Windows, to older versions, I tried all sorts of Darcs tricks... to no avail.

What is more embarrassing, it turns out that this is a well-known Darcs issue, and it looks like there is no hope for a solution in the near future.

So now what? The maintainers of the GHC repo are aware of this Darcs issue, and they have sensibly asked me to avoid pushing patches with conflicts into the main repo. Which means I have to re-record all my patches by hand (with the help of diff/patch) into the main trunk. This could take me several evenings, and is boring as hell!
Of course I could record a One Big Patch, but that's a less desirable solution.

I wish I had known about this issue of Darcs before.
I wish Darcs patch theory wasn't so broken in the first place.


Thursday, August 03, 2006

SoC demographic stats

The Google people have published some numbers about this year's Summer of Code.

My jaw is still on the floor. Haskell.org is ranking above organizations such as The Mozilla Foundation, Eclipse or Ruby Central!!

How would you interpret this?
Could it perhaps be an indication that Haskell and FP in general are starting to become more popular among the masses, at least the young masses?
Or could it just mean that the Haskell.org project proposals were terrific (they were, imho) and attracted a lot of minds?

Either way it is awesome, and I am really thankful to all the people at Google and at Haskell.org who have made this possible.

Tuesday, August 01, 2006

Feather by feather

I bought myself a white Macbook a few weeks ago and have been loving it since the first day. This is my first Apple box and I was a bit afraid that the transition would be time consuming, but I have actually enjoyed it.
One set of articles I recommend for tech-savvy people who are in the same situation - Macs are taking over the world nowadays - is the awesome Ars Technica collection of Mac OS reviews.
Even though this has been a great experience and I am loving Mac OS - the hardware is good too, but nothing gets close to a Thinkpad - the system feels very slow most of the time. I usually have several applications running - Emacs, Mail, iTunes, Firefox, Terminal.App, Preview, Colloquy, Skype... - and a perennial ghc compilation process. I'm not sure if this is solely because of the limited amount of RAM the Macbook has (512 MB) or if this has to do with the Mac OS 'universal cache'. I say this because it is especially noticeable when I launch a compilation session: often the system becomes totally unusable, and I cannot even type in a console.

So a major upgrade was in order. A new 7200 rpm hard drive and two 1 GB sticks are arriving tomorrow, and I am getting ready for the migration. Sadly I can't use Tiger's shiny Migration Assistant to help me, since it requires two Macs and I only have one :S

I'd like to minimize the time spent installing and setting up the new hard drive, but I would also like to do it from scratch in order to fix the beginner mistakes I have fallen into, which include:
  • installing Stuffit, which is a PowerPC binary and will screw up all the file associations for archive types
  • a completely messy iTunes library
  • too many unix packages installed all around. I have used both fink and Darwin Ports, and it seems that a lot of hard drive space is wasted in overlapping base packages. Heck, I even went as far as compiling amaRoK with fink (and all the base packages it requires!)
  • messy ghc installations all around
  • installing several pieces of software I didn't end up using
The list below is a note to self of all the things I have installed and want to keep:
  • Carbon Emacs (my .emacs sets up Haskell mode, hippie-completion and a few other keystrokes)
  • Darcs + Darcsweb + setup my httpd
  • of course Firefox. Sure, I like the looks of Camino, but it is not clear to me why I would want to sacrifice all the available Firefox extensions
    • Google Notebook + Google Sync + delicious Complete
  • Missing Sync (demo). Otherwise, iCal will be useless for me.
  • Colloquy
  • Adium (with this fix)
  • Skype (webcam beta version)
  • Dashboard plugins:
    • Wikipedia
    • Sing that itune
    • laTele
    • cloudLicious
    • Airport Radar
  • Spotlight plugins
  • iTunes plugins
    • Synergy (demo)
    • iScrobbler
    • iTunes Catalog (demo)
  • Textmate (demo)
  • XNJB proved pretty handy to manage my brother's Zen
  • Azureus, DivX
  • Omnigraffle, MindManager (demo)
  • Sleepless
  • TexShop + teTex (via fink)
  • XCode tools + Apple X11
  • Virtue Desktops
  • UnrarX
  • rdesktop (via port)
  • Chicken of the vnc
  • subversion (via port)
  • Parallels (unbearably slow with this little memory). I only need it to tinker with Visual Haskell
  • MacPython
  • Alarm Clock

That's all, and that's enough.
This time the tribute is for Smog.

Thursday, July 06, 2006

Post war

Oh no. Look at this. Almost one month without posting. Look at all you have missed! I feel your pain.

I have been so busy working and coding that I completely forgot about my blog. That, or I hate blogging. This is something that puzzles me. I love reading blogs, I know that. I especially love good writers, so maybe it's a problem of fear of comparison. Steve Yegge is a great writer. Much better than Paul Graham (I won't say sorry Steve, I know he is not reading this). I especially love the witches one, it reminds me of House. I've also been watching House a lot lately. Much better than the overhyped Lost or Prison Break if you ask me.

But honestly, during this month I've been doing some work too. I have actually managed to put up some patches -sorry, you missed my 'first patch' post- and to get myself exposed in the public -you missed that one too-. Don't worry, it is all neatly included in a very nice page at the haskell wiki.

That last part is actually interesting. If you don't mind, since it took me a few minutes to compose it, I am going to call on the holy powers of reuse and paste a nice info bit about the closure viewer from there.

Currently it provides two new commands under ghci, :print and :sprint, both used in the same way as :type or :info. The latter prints a semi-evaluated closure using underscores to represent suspended computations (pretty much as [[Hood]] does). The former in addition binds these thunks to variable names, so that you can do things with them.

Example:

Prelude> let li = map Just [1..5]
Prelude> length li
5
Prelude> :sp li
li - _:_:_:_:_:[]

Prelude> head li
Just 1

Prelude> :sp li
li - Just 1:_:_:_:_:[]

Prelude> last li
Just 5

Prelude> :sp li
li - Just 1:_:_:_:Just 5:[]

Prelude> :p li
li - Just 1 : (_t987::Maybe Integer) : (_t988::Maybe Integer) : (_t989::Maybe Integer) : [Just 5]

Prelude> _t987 `seq` ()

Prelude> :p li
li - Just 1 : Just 2 : (_t457::Maybe Integer) : (_t458::Maybe Integer) : [Just 5]

Prelude> _t988
Just 3
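
The session above depends on which parts of li have been forced so far. The same effect can be observed from plain Haskell, without the closure viewer. What follows is only a minimal sketch under assumptions: the helper names tracked and forcedAfter are made up for this illustration, and unsafePerformIO is used purely as an instrumentation trick, not as anything the closure viewer does internally.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- | Hypothetical helper: builds 'Just n', but records n in the log
-- at the moment the element is actually forced.
tracked :: IORef [Int] -> Int -> Maybe Int
tracked logRef n = unsafePerformIO $ do
  modifyIORef' logRef (n :)
  return (Just n)
{-# NOINLINE tracked #-}

-- | Replays the steps of the session (length, head, last) and returns
-- the elements that had been forced after each step, oldest first.
forcedAfter :: IO ([Int], [Int], [Int])
forcedAfter = do
  logRef <- newIORef []
  let li = map (tracked logRef) [1 .. 5]
  _ <- evaluate (length li)   -- walks the spine only
  afterLength <- readIORef logRef
  _ <- evaluate (head li)     -- forces the first element
  afterHead <- readIORef logRef
  _ <- evaluate (last li)     -- forces the last element
  afterLast <- readIORef logRef
  return (reverse afterLength, reverse afterHead, reverse afterLast)
```

Loaded in ghci, forcedAfter reports no forced elements after length, only the first after head, and the first and last after last, which matches the underscores that :sprint shows at each step.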


The closure viewer's best feature is that it can work without type information, so you can display polymorphic objects whose type you don't know. However, if there is type information available, it is used. It could be made totally independent of type info, so that it could work with opaque or coerced (wrong) types. For instance:


data Opaque = forall a. O a



*Test2> let li = map Just [1..5]
*Test2> let o = O li
*Test2> head li `seq` ()
*Test2> length li `seq` ()
*Test2> :p o
o - O Just 1 : (_t126::a) : (_t125::a) : (_t124::a) : (_t123::a) : []


In the example above the li inside o is not typed, so the bindings aren't either. However, it would be possible to extend the closure viewer so that it recovers its types.

Other currently proposed extensions are a safeCoerce function (not so useful, since it depends on ghc-api) and an unsafeDeepSeq (this one is decoupled from ghc-api). There is also a generally useful (for compiler/tool developers) isFullyEvaluated query function. The signatures are:


isFullyEvaluated :: a -> IO Bool
unsafeDeepSeq :: a -> b -> b
safeCoerce :: GHC.Session -> a -> Maybe b


That's all for this one. Some of you have been annoyed that my feeds are not working. Don't worry, I will be fixing them soon.

Saturday, June 10, 2006

Short Update

In the style of a fellow Summer Haskeller, I enumerate below my current sources of headache.
  • C--: I got two prim ops implemented in GHC, one for retrieving the info table pointer of a closure and another for retrieving the payload. All in all, C-- coding without knowing C--
  • Dealing with pointers: I've been reading the code from FPS to get a better idea of how to work with this kind of stuff. The payload of a closure is retrieved as a tuple consisting of an array of pointers (to other closures) and a bytearray. Skimming through the FPS code (now called Data.ByteString) helped me a lot and now I have the bytearray side of the tuple sorted.
  • Debugging the beast: maybe in an exercise of naivety, I hoped to be able to complete the project without resorting to gdb. After all, debugging the beast is well known to be scary. The current situation is that while the bytearray part is working, I cannot say the same about the array of closure pointers. All I've got for now are segfaults and no clue.

If you want to know a bit more about what exactly I am working on, follow this discussion in the Glasgow Haskell Mailing List.

On a side note, Simon Marlow has set up a Darcs repository at darcs.haskell.org for the nine of us. I still haven't given it much thought, but I will probably have to branch the entire GHC repo in there. I'm not sure if that's the right thing to do.

Saturday, June 03, 2006

Bitten by rsync

My tool of work is a Thinkpad X31, an impressive machine which still does its job after three years of extreme use. I've tried to install Linux on it several times, but I always end up dual booting on Windows: I'm not buying the Linux laptop experience.

Building GHC is a fairly complex process by itself, and setting up the build process on Windows would probably be a nightmare. That's why I'm using a Linux box to build it. I make the modifications on my laptop and use Darcs to transfer them to the compilation box. Sounds ok.

Now, the annoying part is that I often have to correct a few things in the code to fix the ubiquitous compilation errors, and I do that directly on the server (emacs through SSH is awesome). Then I amend the patch and get it back onto my laptop (amend-unpull-pull), which is a royal pain. It is even worse now that compilation times are huge (usually they aren't, but I'm working on some prim ops now and that requires almost full recompilation). I find that I'm coding on my laptop while the box is compiling, and once the amended patch compiles and is tested, I get a lot of conflicts with the current changes when pulling it back to my laptop.

So I seemingly have two options. Either 1) I do all the work on the server through SSH and Emacs, or 2) I do it all on my laptop. I'm not convinced by 1), and for 2) I have already ruled out both Linux on my laptop and building GHC on Windows.

This morning I came up with the idea of using rsync to enable 2). I can do all the work on my laptop, including correcting compile errors, and rsync helps me to get almost instant feedback (modulo GHC compilation times). The good thing about this is that I can avoid producing incorrect patches which I have to amend every time, and avoid the conflicts I get when pulling them back.

Well, so far the experiment has been a disaster. I wanted to play it safe, so I took my time to read a few tutorials on rsync and the man page, set up an rsync server on my build box, carefully set the appropriate filters to synchronize only the appropriate extensions (hs,lhs,c,h,cmm..), exclude the _darcs directory and so on.... A few dry runs later everything seems to be all right, so I jump in and launch the actual thing.
ZAP! In a second I've lost two hours of work on my laptop because it turns out I used rsync in the inverse direction, i.e. overwriting the files on my laptop with the files on the build box. And yes, dumb as I am, I hadn't recorded those changes in a Darcs patch beforehand, and there is no way to restore them.

I can redo that work in a few minutes, it's not a big deal, but right now I feel dumb and pissed!

UPDATE: I got it right at last, and now it is working beautifully. I've set up a two-line script allowing me to do the whole cycle edit -> compile -> edit ... from my side. Another handy trick for doing that was the ssh feature for sending a command remotely.
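
The direction mix-up is easy to make because rsync takes source and destination as two plain positional arguments. Here is a minimal sketch of one way to make the direction explicit when building the command line. It is not the two-line script mentioned in the update; the Direction type and the function names are made up for this illustration, and the extension list only contains the ones named above (the original list is open-ended). The flags used (-a, -v, --dry-run, --include, --exclude) are standard rsync options.

```haskell
-- | Which way the files flow. Hypothetical type for this sketch.
data Direction = LaptopToBuildBox | BuildBoxToLaptop
  deriving (Eq, Show)

-- | Extensions named in the post; the real list was longer.
sourceExtensions :: [String]
sourceExtensions = ["hs", "lhs", "c", "h", "cmm"]

-- | Filter rules. rsync uses the first rule that matches, so _darcs
-- is excluded first, directories are kept so rsync can recurse, the
-- wanted extensions are included, and everything else is dropped.
filterArgs :: [String]
filterArgs =
  ["--exclude=_darcs/", "--include=*/"]
    ++ ["--include=*." ++ ext | ext <- sourceExtensions]
    ++ ["--exclude=*"]

-- | Full argument list for rsync. Source always comes before
-- destination; the direction decides which path is which.
-- Trailing slashes on the paths matter to rsync and are left
-- to the caller.
rsyncArgs :: Bool -> Direction -> FilePath -> String -> [String]
rsyncArgs dryRun dir localDir remoteDir =
  ["-av"] ++ ["--dry-run" | dryRun] ++ filterArgs ++ [src, dst]
  where
    (src, dst) = case dir of
      LaptopToBuildBox -> (localDir, remoteDir)
      BuildBoxToLaptop -> (remoteDir, localDir)
```

The resulting list could be handed to something like System.Process.callProcess "rsync". Doing the dry run and the real run through the same function, with only the Bool flipped, means the dry run really checks the direction that will be used afterwards.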

Sunday, May 28, 2006

A first tiny step

Last night I managed to do it: modify the beast, compile, and test my changes!

The goal was to obtain the closure type of an HValue, the internal datatype representing linked BCOs (bytecode objects) used by ghci. It was a bit annoying to get there, because it involved
  • unsafeCoerce# (aka the Blunt Instrument)
  • fiddling with pointers
  • and going down to C-land.
But it was actually a piece of cake!

Visit my darcs repository if you want to see the code

Saturday, May 27, 2006

I should be coding ...

...instead of posting here, but I guess I can use a few minutes to briefly introduce the SoC project I am working on.

Haskell is a beautiful, lazy functional programming language, and GHC is its compiler. We (that's me and my headache) are working on an easy-to-use, simple, bare-bones debugger based on GHC, or GHC-interactive to be more specific. How come there isn't a debugger for Haskell yet?

Well, of course there are debuggers, and they are far more advanced than the one we are aiming at. See for instance the Hat tracer. The main problem is that they are outdated, since they cannot keep up with the pace at which Haskell evolves. Haskell 98 is the standard, but every new version of GHC sets the current status of the language, implementing several (lots of) experimental extensions. The Haskell research community is enormously active.

As a result, you are restricted to a (bigger or smaller) subset of the language if you want to use one of these debuggers. Our goal is to build on top of GHC's very own machinery so that our debugger will never restrict you or get outdated. And to keep it simple.

Wish us good luck :)
(ok, I swear the jokes about my headache end here)

PS: Just to clarify, the debugger is not to my credit. I am only working on a tiny part, building on top of the work of David Himmelstrup, my mentor (poor soul).

Friday, May 26, 2006

A Hell of a week

This one was an all-around shiny week which started all wrong. On Monday I had a pretty serious bike accident. In fact I can't remember anything about it: I hit the back of my head hard on the ground and lost consciousness for a few seconds. Luckily there were kind locals around to help me, and I managed to get back on my feet with a few bruises, an almost-broken rib and sore knees.
Because of the head trauma I was having some strong dizzy spells, so I have spent some time in the hospital undergoing tests, until yesterday they found that all I have is this. Phew! Good to know that my brain survived! (though it is the same brain that had a stupid bike accident on Monday in the first place)

On Wednesday morning, before my bike was stolen some hours later, I got accepted for Google Summer of Code, on a very interesting project for Haskell. La ostia! The funny thing is that many people in my research group found out before I did, reading it in a post to the Haskell list. It rocks to have this sort of group mates!
I'm a bit scared by the prospect, because until I get some vacation time in July my time will have to be shared between work on SoC, PhD courses and PhD work, so it is going to be a tight schedule.

On another topic entirely, my dad is progressing well with Rails. It's an impressive feat (for an old man like my dad ;) ) that in only one month he is speaking Rails fluently, considering that beforehand he knew nothing about HTML, CSS, Javascript, or web development as a whole. I'd be lucky to be one percent as smart as he is. Also, he just admitted that diving through the Rails sources has helped him immensely. He is starting to appreciate the whole point of open source, and for an old man like him who used to dismiss OSS, that's a huge accomplishment.

That was all. Not bad for a first post. I will try to post at least one update a week, though I know beforehand that's an empty promise.

Saturday, December 10, 2005

Here I am

That's me with a funny face, in a fencing suit