Happy New Year. In 2006, I ended the year with a post that I called a retrospective, i.e., a look back on the year past. In the interests of doing something different this year, I will end with an introspective. Some weeks ago, on a relatively slow afternoon, I found a link on Splee's blog that led me to a couple of rather interesting personality tests, the results of which appear below:
Which Programming Language are You?
Which OS are You?
I don't normally take these tests, having about as much respect for their effectiveness as I would for the prognostications of a Magic 8-ball. However, knowing what kind of programming language or operating system I resembled had a certain nerdy appeal. Besides, how easy or natural it felt to work in these two environments would be a fair indication of how good the tests were, so I went for it.
While PHP would not be the first language I would identify with, I did find it very easy to work with both times I tried it, thanks in part to the great online documentation and examples. As for AmigaOS, I did not know anything about it, except for the fact that an OS once existed by that name. Reading about it, however, I think I would have liked working on it, and possibly even identified with it, it being so far ahead of its time and all :-). Unlike with AmigaOS, however, my parents actually went out of their way to give me opportunities for success, so the fact that I am not as successful as the test would like me to be is probably solely my fault :-).
Or maybe I am just a gullible person, and in reality, these are just generic profiles which my mind coerces into the right shape to fit my self-image.
Be that as it may, regardless of my (or your) opinions on the predictive powers of a personality test, the more interesting question was, how do they work? Turns out that there is some math behind at least some of the tests. I came across some published material on this here, here and here.
Based on this, one possible algorithm goes like this. To train the personality analyzer, a large set of questions is asked of a set of control subjects, and their answers are correlated with their actual personality types (determined by manual psychiatric evaluation). Each test has a small, finite set of personality types that it can return as a result (such as the 16 personality types proposed in the Myers-Briggs classification). Each multiple-choice answer to each question is assigned a weight for each personality type, which represents the estimated probability of that choice being made by a member of that personality type. In pseudo-code:
    for control_subject in control_subjects:
        for question in questions:
            answer = question.selected_answer
            personality_type = control_subject.personality_type
            total_answered[question, answer, personality_type] += 1
    for control_subject in control_subjects:
        personality_type = control_subject.personality_type
        count_of_personality_type[personality_type] += 1
    for question in questions:
        for answer in answers:
            for personality_type in personality_types:
                prob[question, answer, personality_type] =
                    total_answered[question, answer, personality_type] /
                    count_of_personality_type[personality_type]
At the end of this, each answer to each question will have, for every personality type, a score representing the probability of a member of that type selecting that particular answer.
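As a rough illustration, the training step above could be sketched in Python as follows. This is only a minimal sketch under my own assumptions: the control data, the question names (`q1`, `q2`), the type labels (`A`, `B`) and the `train` function are all made up for the example and do not come from any real test.

```python
from collections import Counter

# Hypothetical control data: (personality_type, {question: chosen_answer}).
control_subjects = [
    ("A", {"q1": "yes", "q2": "tea"}),
    ("A", {"q1": "yes", "q2": "coffee"}),
    ("B", {"q1": "no", "q2": "coffee"}),
    ("B", {"q1": "yes", "q2": "coffee"}),
]

def train(subjects):
    """Estimate prob[(question, answer, personality_type)] from raw counts."""
    total_answered = Counter()   # times each answer was picked, per type
    count_of_type = Counter()    # number of control subjects of each type
    for personality_type, answers in subjects:
        count_of_type[personality_type] += 1
        for question, answer in answers.items():
            total_answered[question, answer, personality_type] += 1
    # Divide each count by the number of subjects of that personality type.
    return {
        key: count / count_of_type[key[2]]
        for key, count in total_answered.items()
    }

prob = train(control_subjects)
print(prob["q1", "yes", "A"])  # 1.0: both type-A subjects answered "yes"
print(prob["q2", "tea", "A"])  # 0.5: one of the two type-A subjects chose "tea"
```

Combinations that no control subject ever chose are simply absent from the table in this sketch, which amounts to treating their probability as zero.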
A prospective test-taker is assigned a random subset of the question set. The sum of the weights of his/her answers is computed for each personality type, and the type with the highest sum is offered as the result. In pseudo-code, again:
    for question in questions:
        answer = question.selected_answer
        for personality_type in personality_types:
            score[personality_type] += prob[question, answer, personality_type]
    return the personality_type with the highest score[personality_type]
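The scoring step can be sketched the same way. Again, this is just an illustration under assumed data: the weight table, the question names and the `classify` function are hypothetical, and answers with no recorded weight are assumed to contribute zero.

```python
# Hypothetical weights: prob[(question, answer, personality_type)].
prob = {
    ("q1", "yes", "A"): 1.0, ("q1", "yes", "B"): 0.5, ("q1", "no", "B"): 0.5,
    ("q2", "tea", "A"): 0.5, ("q2", "coffee", "A"): 0.5, ("q2", "coffee", "B"): 1.0,
}
personality_types = ["A", "B"]

def classify(selected_answers, prob, personality_types):
    """Sum the weights of the chosen answers per type; return the best type."""
    score = {t: 0.0 for t in personality_types}
    for question, answer in selected_answers.items():
        for t in personality_types:
            # Assumption: an answer never seen for a type contributes 0.
            score[t] += prob.get((question, answer, t), 0.0)
    best = max(score, key=score.get)
    return best, score

best, score = classify({"q1": "yes", "q2": "tea"}, prob, personality_types)
print(best, score)  # A {'A': 1.5, 'B': 0.5}
```

Note that ties are broken here by whichever type comes first in `personality_types`; a real test would presumably need a more deliberate rule.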
While on the subject of Statistics, I was in India recently and took some time out to meet my folks. One of them is a brilliant but rather acerbic old gentleman who started out as a physicist (as did I, hence the connection) and later turned his talents to mechanical engineering. I happened to mention that I used Statistics at work, and gave him an example of one such case. The conversation then veered to how statistics can help when not all the variables in a problem are fully defined. He commented that for problems where all the variables are known, vector algebra could help, and that where the resulting matrix model is computationally intensive, sets could be used instead. I had recently discovered this for myself (the hard way, by building both implementations), but I was surprised by his remark since I had not even defined a problem, which led me to think that this could be some kind of heuristic. Does anyone know what I should be looking for if I want to learn more about the reasoning behind such a heuristic?
On a completely different note, I indulged in an accidental bit of tourism, visiting Mount Abu on a day trip. I was really quite blown away by the Jain temples there. The ceilings and columns in the temples are intricately carved out of marble, and were built over a period of about 500 years starting in the 11th century. The ceilings are inverted semi-spheroids with lotuses and such hanging off the top, all carved out of one solid block of marble. According to the guide, one of the temples was built by the craftsmen themselves, in their spare time, using marble that was not perfect enough for the main temples and had therefore been discarded. Like most ancient structures, it is as much an engineering feat as a work of art, so in that sense it could probably be considered an early example of skunkworks development.