Data science in society: humans as numbers


In «modern» society, at a certain age you officially become a number.

You are also a member of a community (family, friends, …). You have dreams, fears, needs, obligations and so on.

Then, if you are a number, how can it represent you?

Let us consider a basic number: your age.

About a month ago I was walking my dogs and at some point I fell to the ground. Just like that. I got up and decided to head back home. After five minutes or so, I was in pain. Ten years ago this would have been nothing, but now proper care and rest must be taken into account. So my age does contain some information about me.

I am also a national ID number, a phone number and a bank account number. Together with my age, these four numbers combine and give rise to a phone call: «Good morning. We are contacting you (from the bank) to offer health insurance and also (God forbid something happens to you) life insurance…» I always laugh and make a stupid comment, but I continue the conversation. The (usually young) person offering these services needs to complete their assigned task. What a job…

Thus, I am a number whether I like it or not, and my age is relevant not only to me.

Notice that someone in the bank (a team, to be realistic) was counting. Since for a large amount of data actual counting is practically impossible, some probabilities must have been used. To be more precise, I am sure that the age interval 30-40 was defined in some database and my account number was selected among others. Moreover, due to the social and political circumstances in Guatemala, there should be a chart that assigns a probability (larger than 50%) that a customer within this age interval is a parent. This may not be the only information: the probability must also take into account the average age of the children and financial behavior.
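To make the counting concrete, here is a minimal sketch of what such a selection might look like. Everything in it is an assumption for illustration: the `Customer` record, the `is_parent` field, the account numbers and the six made-up customers are hypothetical, and I have no idea how the bank actually does it.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    account: str     # hypothetical account number
    age: int
    is_parent: bool  # assumed to be known from some survey (hypothetical field)


def parent_probability(customers, low=30, high=40):
    """Fraction of customers aged low..high (inclusive) who are parents."""
    group = [c for c in customers if low <= c.age <= high]
    if not group:
        return None  # nobody in the interval, so no estimate
    return sum(c.is_parent for c in group) / len(group)


# Made-up data, only to illustrate the idea.
customers = [
    Customer("A-001", 34, True),
    Customer("A-002", 38, True),
    Customer("A-003", 31, False),
    Customer("A-004", 36, True),
    Customer("A-005", 52, False),
    Customer("A-006", 25, False),
]

p = parent_probability(customers)

# If the estimate is above 50%, everyone in the interval gets the call,
# whether or not they are actually a parent.
targets = []
if p is not None and p > 0.5:
    targets = [c.account for c in customers if 30 <= c.age <= 40]

print(p, targets)
```

In this toy data the customer with account A-003 is not a parent, yet still ends up on the call list, simply because the group as a whole crosses the 50% threshold.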

After a finite number of calls, the offers stopped. Now they simply offer me an amazing credit card and car insurance. They realized that my probability is actually less than 50%. Other products must be offered (I enjoy the thought that the algorithm actually cannot say anything realistic about my actual circumstances).

Definitely, I am a number.

One of the things that bothers me is that I’m not an interesting number like π or, better, 1/π.

Does it bother me that an algorithm tries to model my life? Yes, but naive probabilities cannot «see» specific details.

In technical terms, fluctuations around an average value are difficult to interpret. To illustrate this point, consider the following figure:

The horizontal line corresponds to the average value of the colored points. The fluctuations indicate how the points scatter around this line. A naive probability will not distinguish the color of the points.
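The same idea can be sketched with a few lines of code. The numbers and the «blue» label below are invented for illustration (they are not the data of the figure); the sketch only shows that an overall average and its spread say nothing about which color a point belongs to.

```python
from statistics import mean, pstdev

# Made-up (color, value) points, chosen only for illustration.
points = [
    ("blue", 4.8), ("blue", 5.1), ("blue", 5.3), ("blue", 4.9),
    ("orange", 7.0), ("orange", 6.5), ("orange", 1.4),
]

values = [v for _, v in points]

# The "horizontal line": one average for all points, colors ignored.
average = mean(values)

# Fluctuations: how far each point sits from that line.
fluctuations = [v - average for v in values]

# A single number summarizing the scatter, again blind to color.
spread = pstdev(values)

# Grouping by color recovers the detail the naive summary throws away.
by_color = {}
for color, v in points:
    by_color.setdefault(color, []).append(v)

per_color = {c: (mean(vs), pstdev(vs)) for c, vs in by_color.items()}

print("overall:", average, spread)
for c, (m, s) in per_color.items():
    print(c, m, s)
```

With these invented values the blue points hug the line while the orange ones scatter widely, and the single lowest point is orange. The overall average and spread alone would never tell you that.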

The criterion used for managing fluctuations outside fundamental physics is more complicated, since there are no laws of nature for guidance. In the social sciences there are laws, but they are not at the same stage as the laws of nature. You may even have equations for social phenomena, but they cannot be compared to the ones briefly mentioned in the last post.

The criterion is crucial for interpreting the data. The fact that probability shields specific details of my life can also have negative repercussions. What if I suffer from a rare disease and am thus statistically irrelevant to the national health care system? (See the lowest orange point in the figure.)

In the following post I’ll discuss mathematical tools for handling large amounts of data and, more importantly, how to apply these tools carefully in order to obtain information from the fluctuations.
