Sunday, August 30, 2009

Data Mining

Data differ in value according to their capacity to align with other data. The more basic or fundamental the datum, the more value it has. This applies, of course, only to true or valid data. Hitting upon a basic, valid datum is like hitting upon a rich vein or lode of ore. As you work down the vein, you may be able to generate many new understandings by aligning this basic datum with older data and new observations. Understandings are generated through correlation and cross-correlation with “known good” data.

A set of such “known good” data is a great tool for data mining and for testing the validity of new data as it is encountered. The idea is to develop a coherent data set that has been largely or even completely cross-correlated, so that if it is accessed or “touched” at any point, all cross-correlations from or to that point are immediately known and available as self-determined understandings. Coherency is itself a test of validity: no datum should have to be “crammed in sideways” to make it fit with the others. If a new datum is valid, it will fit in well with all parts of the coherent data set.
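
For readers who think in code, the idea above can be sketched as a toy model. This is only an illustration under loud assumptions, not a real data mining method: the class name `CoherentSet`, its methods `fits`, `add`, and `touch`, and the sample data names are all hypothetical. Here a datum "fits" simply if it does not contradict a value already held, which is far cruder than the kind of coherence described in this post.

```python
from collections import defaultdict


class CoherentSet:
    """Toy model (hypothetical): named data plus their cross-correlations."""

    def __init__(self):
        self.values = {}               # datum name -> asserted value
        self.links = defaultdict(set)  # datum name -> names it is correlated with

    def fits(self, name, value):
        # Assumption: a datum fits if it does not contradict a held value.
        return name not in self.values or self.values[name] == value

    def add(self, name, value, correlates_with=()):
        # Reject a datum that would have to be "crammed in sideways".
        if not self.fits(name, value):
            return False
        self.values[name] = value
        for other in correlates_with:
            if other in self.values and other != name:
                self.links[name].add(other)
                self.links[other].add(name)
        return True

    def touch(self, name):
        # Walk the cross-correlations outward from one datum and
        # return every other datum reachable from it, sorted by name.
        seen, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.links.get(current, set()) - seen)
        seen.discard(name)
        return sorted(seen)


known = CoherentSet()
known.add("boiling_point_c", 100)
known.add("freezing_point_c", 0, correlates_with=["boiling_point_c"])
known.add("celsius_scale_span", 100,
          correlates_with=["boiling_point_c", "freezing_point_c"])

accepted = known.add("boiling_point_c", 90)  # contradicts the held value
related = known.touch("freezing_point_c")    # everything linked to this datum
```

In this sketch the contradictory datum is turned away and the held value is left unchanged, while touching any one datum returns the others it has been cross-correlated with.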

A person might want to have more than one such coherent data set, with different sets used for different areas of life or work. Each set would be expected to grow over time as it is “worked” by cross-correlating it within itself as well as with new data encountered. The larger the set, the easier it becomes to add new valid data and to reject newly encountered false data.

Data Prospecting

Where would a person look to find the makings of a coherent data set? Look for data that govern areas of activity that have operated at a high level over a long period of time, or areas that have shown jumps in productivity.

A prime example from biology is the organization of the body's cells into tissues and organs. Animals and plants have been quite successful over a long period of time. That success, in general, seems to come more from the order built into the system of cell organization than from any particular intelligent entity occupying or governing such organizations of cells. There is much more variance among these intelligent entities than is observed among individual plant or animal bodies.

An example from mechanics is the steam engine, or the internal combustion engine. Civilization really took off from the Middle Ages into the Industrial Revolution and beyond as these sources of motive power were developed and brought into widespread use.
