David R. MacIver's Blog
Computational linguistics and Me
Apparently I’m a computational linguistics blogger. This is sort of news to me. The closest I’ve come to blogging about computational linguistics is in writing a borderline rant about academia.
That being said, I do work in computational linguistics: SONAR is basically a great big NLP system.
This fact, however, is almost totally unrepresented in my blogging.
Actually, that’s part of why I’ve been blogging so much less recently. Since moving onto SONAR my brain has been afire with newly acquired knowledge and trying to figure out how best to apply it to work problems. This has left relatively little time for most of the other stuff I think about that normally generates blogging.
Of course the obvious solution is that I should be blogging about
computational linguistics. But that has some obstacles. Primarily:
Confidentiality
All the computational linguistics stuff I do is for work. I tinker
around with it at home, but haven’t really done anything useful. This
makes it difficult to know what I can blog about: I certainly can’t go
“HEY GUYS. I FIGURED OUT THIS AWESOME ALGORITHM WHICH WE’RE USING IN
SONAR” for everything. We rather rely on some of that magic to make us
money. :-)
That being said, there’s definitely stuff I can blog about.
For example, there’s nothing particularly confidential in how we extract likely
candidate phrases from a document, and it’s at least mildly interesting
(probably more to non-linguists, but who knows?). In fact, we’re
actually all encouraged to blog more about what we do but never find the
time. So, really, work isn’t that much of an obstacle to blogging about
this. It just requires a bit of careful thought.
Experience
I’m very new to computational linguistics. As such, I’ve a much less
clear idea of what’s bloggable in it. If we look at my
blogging history, I started blogging about programming in February
2007. That’s just shy of a year after I started working as a programmer
(which, effectively, is just shy of a year after I started programming
anything in earnest). And I think it took another six months of blogging
before I actually wrote anything worth reading. In comparison, I’ve not
even worked in computational linguistics for 6 months (I think I started
work on SONAR in September and had no exposure to it before that). So
I’m very much still sort of fumbling along, trying to figure out the best
way to do things.
From a work point of view that’s fine. Actually some of my best work is done when I don’t know what I’m doing: I’m more able to ask stupid questions and get useful answers and I come at things from a sufficiently different angle to normal that sometimes I produce unexpected results.
But from a blogging point of view it’s pretty likely that what I end up writing about will range from the trivial to the wrong, until I find my feet. Some of it might be of interest to non-linguists but too basic to be of interest to linguists. Some of it might be so esoteric that it would only be of interest to linguists, or at least it would be if they weren’t so easily able to point out why it’s wrong. Some of it might be of interest only to me.
But actually this is a really piss poor excuse to not blog about it.
Because, frankly, I do not write to amuse you. Writing for other people
is, to me, a waste of time. I write about what is of interest to me.
With any luck other people will find it interesting too, but that isn’t
the primary point.
So...
In conclusion, my two main reasons for not blogging more about
computational linguistics, natural language processing, etc. suck. So
expect to see more about it here in the future. This probably means
you’ll see more Ruby as well, as that’s what we use at work and I don’t
expect I’ll bother translating into Scala except when I have a specific
reason to do so.
Comments
Jan Berkel on 2009-01-27 01:34:17:
wow, great stuff david. while you write a blog post about not having enough time to write i could write a post (if i had a blog) about not finding enough time to read all the blogs i’m subscribed too. information overload, welcome to the 21st century.
david on 2009-01-27 01:38:09:
I keep meaning to write tools to help me manage the information overload, but I never find the time. :-)