What are the chances?

I’m bad at math, perhaps.

I got 13% for Math on the university entrance exam. Number-crunching courses, and the ones I flopped because the calculations exceeded my mental processing capability, make up the majority of my academic struggles. I failed to get even half of the last Advanced Calculus exam done in the allotted time. I don’t even know how to do a Fourier transform without a reference table and at least an hour of free time.

I would be given 15 minutes for the same problem on the exam.

So what are the chances of me working in the field of Statistics? I dare say a 3.4-in-a-million chance at a 95% confidence level, give or take a 5% margin of error, is an optimistic estimate to give myself. And in case anyone is wondering, that abysmal figure is the defect rate in global leaders’ production standards.

The chance of me getting involved in Statistics is an error.

Surprisingly enough, I enjoy Statistics and it is perhaps the only math-related subject I would study in my free time as a long-lasting passion.

I remember when our roleplaying group was still together, Anata Ein and I spent hours on a sync.in chat pad, discussing various forum dwellers in search of fresh blood. He was the mastermind behind everything; he knew politics and its shady practices. I provided him with data: I scrutinized people’s backgrounds and pieced together profiles from conversations and other scattered sources. It would take me only a moment of looking at the records to match an unnamed user in the pad to a profile (provided they had one) based on network IP, browser version, verbal cues and timezone.

It was the start of my awkward relationship with data. I love getting new information. I love the immense sense of power and insight data grants. But without Statistics, my romance with data is a one-sided admiration from afar. I can only perceive the very good and the very bad and never the context, trusting blindly in the immutability of perceived qualities. I could never deeply understand what data is beyond the surface.

The job I took on this summer lies outside my comfort zone. My mentor was Gloria Quizon. She is the company’s Six Sigma champion, an expert business process analyst with more than ten years of data-crunching experience under her belt. Her belt was black, the second-highest rank attainable in the Six Sigma discipline. And at the time, she was studying in her free time to become the third Master Black Belt in the entire corporation.

Over the past two months, I studied Statistics under her guidance. The learning curve was steep, which meant rapid progress. My previous ventures in data collection and programming let me automate the compilation process, and first-hand experience in book-keeping enabled a swift understanding of the terminology and of experimental design concepts.

It’s true that the quickest way to pick up new knowledge is by anchoring it to existing know-how.

When it came to compiling 800+ data logs into a single spreadsheet, with all the calculation steps needed to process the data first, I didn’t have to spend two hours working on five logs. I spent five hours working on a programming solution that crunched six months’ worth of data logs, everything, in 20 minutes.

Twenty minutes to go from a mass downloader to an XBar-S processor to a spreadsheet that displays Cpk calculations, z-test statistic evaluations and even recommended corrective actions for six production lines, with two types of data logs each. Such is the power of automation in the office.
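For anyone curious what the X-bar, S and Cpk steps boil down to, here is a minimal Python sketch. It is an illustration only, not the actual tool: the function names, the measurements and the spec limits (`lsl`, `usl`) are all hypothetical, and it uses the plain sample standard deviation of all values rather than a within-subgroup estimate.

```python
from statistics import mean, stdev


def xbar_s(subgroups):
    """Return (mean, sample standard deviation) for each subgroup."""
    return [(mean(group), stdev(group)) for group in subgroups]


def cpk(values, lsl, usl):
    """Process capability index: distance from the mean to the
    nearer spec limit, measured in units of three standard deviations."""
    mu = mean(values)
    sigma = stdev(values)
    return min(usl - mu, mu - lsl) / (3 * sigma)


if __name__ == "__main__":
    # Hypothetical measurements from one production line, three per subgroup.
    subgroups = [[9.9, 10.1, 10.0], [10.2, 10.0, 9.8], [10.1, 9.9, 10.0]]
    all_values = [v for group in subgroups for v in group]

    for xbar, s in xbar_s(subgroups):
        print(f"X-bar = {xbar:.3f}, S = {s:.3f}")

    # Hypothetical lower and upper specification limits.
    print(f"Cpk = {cpk(all_values, lsl=9.4, usl=10.6):.2f}")
```

A higher Cpk means the process sits more comfortably inside its spec limits. A real X-bar and S chart would also need control limits, which this sketch leaves out.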

When I created Machines of War AG three years ago, I faced the same data management challenge. I’m going to cover the solution in another blog post.

My first planned novel was named “Imperial Experiment”. I did a lot of research on experimental design and survey techniques for the story, and it ended up helping me when it came to learning how to do these tasks properly. In fact, what I learned from Design of Experiments will be of great value when I put pen to paper and start writing Imperial Experiment.

After White Destiny, of course.

Looking to the future, I can already see countless applications for Statistics. According to a recent article in the NYTimes, different statistical tests conducted by professionals can arrive at different conclusions. They aren’t necessarily incorrect or flawed by design. There is no right or wrong in statistics. It can be a matter of perspective, in which a lurking factor unknown to either test leads to different responses.

It is then the researcher’s job to define a better model.

Since I took an interest in Statistics, I have not been able to see many things in the same light. Whereas in the past I would use a structured logical argument, aka a debate, to settle a dispute, I now realize such a method would be too subjective. Debating is not the same as proving; debating is persuading.

These days, I would put the Diplomacy check aside and ask: “Show me the supporting data, show me your method and I will tell you what the chances are”.

Only data can objectively decide what’s true and what’s false, and at which level of certainty. But one must ask; one must speak the analytical language to receive the answer. Depending on how one phrases the question, one can get a nod, a shake or a shy, indecisive blush.

I still adore data as the quiet yet perceptive girl who knows all about the wonderful world around her, who speaks no more than what one asks, and who hides herself, coming out only when someone looks for true value behind the nonfactual flaws.

But I don’t trust the eyes that cannot tell her from the errors. I don’t trust the Statistics that I haven’t questioned myself. Or to be more precise…

I can no longer.


