Noons posted quite an entertaining essay, no moore (part 2). While reading it, I was agreeing all the way through, but something inside me was protesting. I couldn’t put my finger on it, so I started reading again from the top, and finally the thought materialized. I probably can’t put it as well as Noons can, but here I go. Oh… before you proceed you might want to read this post, especially the comments. That’s how it all started.
We humans are not able to process large amounts of precise data. We can have several bank accounts and credit cards, but I bet very few of us are interested in the balance down to the cent/pence/kopeika. We want the broad figure (am I close to the red line? is it enough for the coming monthly payments?) and similar measures.
In any human-readable report, we don’t need more than ten or twenty lines of numbers. Every time we look at processes or data more complex than that, we employ simplifications: trend graphs, mind-mapping, aggregations, and so on. So never mind petabytes; we humans can’t digest even megabytes of data. Anything larger, and we have to simplify, aggregate, and average.
This is clear enough, but simplification, aggregation, and the like require processing huge amounts of information. However, for the absolute majority of tasks, we do NOT need a precise answer.
This gets us closer to ratings, averages, and approximations. Perhaaps that will be the keyword in the future of data processing: “approximation”. What will the indexes of future databases be? Not bitmap indexes and not b-tree indexes, but approximation indexes based on estimates. And the answers delivered by the databases of the future will resemble “slow”, “fast”, or “very fast”, instead of “159.4567 km/hour”.
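To make the idea a little more concrete, here is a minimal sketch in Python of what I mean. Everything in it is invented for illustration: the function names, the sample size, and the speed thresholds are mine, and a real “approximation index” would of course be far more sophisticated than sampling a list. The point is only that an estimate from a small sample, mapped to a coarse label, is often all the answer a human needs.

```python
import random

def approximate_avg(values, sample_size=1000):
    """Estimate the average of a large collection from a random sample
    instead of scanning every element."""
    sample = random.sample(values, min(sample_size, len(values)))
    return sum(sample) / len(sample)

def fuzzy_speed(kmh):
    """Map a precise figure to the kind of answer a human actually wants."""
    if kmh < 60:
        return "slow"
    if kmh < 120:
        return "fast"
    return "very fast"

# A million "measurements" stand in for a table we would rather not full-scan.
speeds = [random.uniform(40, 200) for _ in range(1_000_000)]

estimate = approximate_avg(speeds)
print(f"estimated average: {estimate:.1f} km/h -> {fuzzy_speed(estimate)}")
```

The sample is a thousand rows instead of a million, yet the label it produces is the same one a full scan would give almost every time.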
I’m sure that techniques from artificial intelligence and business intelligence, disciplines undergoing a lot of development, will become core data processing principles at some point.
How will we make computers process information just as humans do? Perhaps we can find the answer inside ourselves if we figure out how our minds work. We recognize images in an instant even though our brain is not capable of crunching huge amounts of raw data in milliseconds; presumably we manage it through approximation and pattern matching rather than precise computation.
2 Comments
Hmmm, yeah. I think I see your point and it is very valid. I’ll add on here, the best I can.
Yes, the “fuzziness” will be there. But not necessarily in the actual results.
I think we’ll see something along these lines:
1- The widespread use of what I call the “MV timewarp”. Users will accept a tradeoff between timeliness and speed of result. They’ll accept that the actual result will not include the latest, whitest and brightest RFID data but will be a precise result as of some time in the past. How far back is an unknown at this stage.
2- The fuzziness you note will go into CBO stats gathering and its effect. For this scale of problem, the nuances between, for example, “very large” and “bloody huge” will be moot: after a few TB, it’s all “freakin humongous!” as far as 100MB/s is concerned!
or words to that effect!… ;-)
Still: yes, there is indeed an argument for fuzziness. But don’t forget this: the end user is still armed with that annoying Excel and used to seeing high precision…
Thanks for the comment, Noons.
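Your “MV timewarp” in point 1 could be sketched roughly like this. This is only a toy illustration in Python, not how any real materialized view refresh works; the class name, the refresh interval, and the stubbed-out aggregate are all invented. The idea it shows is just the trade-off you describe: the answer stays exact, but exact as of the last refresh, and ordinary queries never pay for a scan of the base data.

```python
import time

class TimewarpSummary:
    """A materialized-view-style cache: answers are precise,
    but only as of the last refresh, never live."""

    def __init__(self, compute_fn, max_age_seconds=3600):
        self.compute_fn = compute_fn          # the expensive full computation
        self.max_age_seconds = max_age_seconds
        self.value = None
        self.as_of = None

    def get(self):
        now = time.time()
        if self.as_of is None or now - self.as_of > self.max_age_seconds:
            # The only time we pay for a full pass over the base data.
            self.value = self.compute_fn()
            self.as_of = now
        return self.value, self.as_of         # exact answer plus its timestamp

# Usage: an expensive aggregate over "all the data" (stubbed out here).
def total_sales():
    return sum(range(10_000_000))             # stand-in for the big scan

summary = TimewarpSummary(total_sales, max_age_seconds=600)
value, as_of = summary.get()                  # precise, but possibly 10 minutes stale
print(value, "as of", time.ctime(as_of))
```

How far back “as of” is allowed to drift is, as you say, the open question.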
Peering further ahead, bitwise computer logic might turn out to be a dead-end technology. Perhaps this is our future?