The Patient Zero Syndrome
CAVEAT: This document comments on an article published by Bloomberg LP, my current employer. All the opinions expressed herein are my own and do not reflect those of my employer.
Quite recently, Facebook announced the release of a new algorithm to reduce, if not remove altogether, clickbait links in the Newsfeed.
First of all, what is clickbait? Clickbaiting is an online monetization strategy that consists in driving traffic to pages of questionable quality for the sole purpose of selling advertising on those pages. The more people land on a page, the more money its owner makes.
Why "questionable quality"? Since the Facebook Newsfeed considers freshness an important feature, it tends to privilege new content, or content with recent activity, over older content. This means that editors need to create a lot of content to stay relevant on the Newsfeed, and the fastest way to produce new content without an army of journalists is to publish an awful lot of shit: articles without any meaningful content, posts rehashing well-known things over and over, non-news, and so on.
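As a toy illustration of this freshness pressure, consider a hypothetical ranking score with exponential time decay. The function name, parameters, and half-life below are all made up for the sketch, not Facebook's actual formula:

```python
def newsfeed_score(engagement: float, age_hours: float,
                   half_life_hours: float = 6.0) -> float:
    """Hypothetical ranking score: engagement decayed by post age.

    With a 6-hour half-life, a day-old post needs roughly 16x the
    engagement of a brand-new one to rank equally, which is exactly
    the kind of pressure that rewards constant, high-volume posting.
    """
    return engagement * 0.5 ** (age_hours / half_life_hours)
```

Under this made-up decay, flooding the feed with fresh low-effort posts beats writing one good article a day, which is the economic incentive described above.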
Why should I click on those links? Well, you shouldn't, and maybe you don't, but a lot of people do, just because the title is catchy or it piques their curiosity, e.g., "This woman went to the seaside and what she found was unbelievable!", or "What doctors don't tell you…", etc. You get the idea.
I don't click on them, but they often appear in my Newsfeed anyway just because one of my contacts shared or liked them. I hate that.
So Facebook's move to remove all this clutter from the Newsfeed seems pretty good, and you would expect everyone except the people running those websites to be happy; moreover, real journalists should be the happiest of all, am I right? No, I am not.
To my great surprise, the most vocal opponents of this change are the journalists themselves. For instance, I was particularly struck by an opinion piece published by Bloomberg.
The article is well done and complete. It explains what the change is about, provides links and charts, etc., and at the end it says: «At this stage of its development, artificial intelligence is terrible at processing human languages, and letting it police content is premature. The tweaked algorithm would probably classify the New York Times' and the Guardian's sarcastic headlines about Facebook as clickbait because they contain the word "shocker" and the expression "you won't believe." Silicon Valley companies are overconfident in their technology. They have to admit humans are better at content, at least for now». Aside from the naive assumption that such a system will not use whitelists and blacklists to keep or discard certain sources without even running the classifiers, I think the author is right that we are not there yet technologically, that a machine cannot be very good at detecting genuine sarcasm (neither can some people I know), and that for all those reasons the system will do more harm than good at the beginning. But at the very same time, I don't care.
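To make the whitelist/blacklist point concrete, here is a toy sketch of how trusted or banned sources could be handled by simple lists, so only the ambiguous middle ever reaches a classifier. Everything here is hypothetical: the lists, the keyword-matching stand-in for a real model, and the 0.5 threshold are invented for illustration:

```python
# Hypothetical lists: trusted sources skip the model, banned ones skip it too.
WHITELIST = {"nytimes.com", "theguardian.com"}
BLACKLIST = {"totally-legit-news.example"}

def clickbait_score(headline: str) -> float:
    """Stand-in for a real text classifier: naive trigger-phrase matching."""
    triggers = ["you won't believe", "shocker", "what happened next"]
    hits = sum(phrase in headline.lower() for phrase in triggers)
    return min(1.0, 0.6 * hits)

def should_demote(source: str, headline: str) -> bool:
    if source in WHITELIST:
        return False   # a sarcastic NYT headline never reaches the model
    if source in BLACKLIST:
        return True    # known clickbait farms are demoted unconditionally
    return clickbait_score(headline) > 0.5
```

With gating like this, the sarcastic Times and Guardian headlines the author worries about would never be scored at all, which is why the "the model will flag them" objection only half holds.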
What I mean is that initially Facebook's new algorithm will screw things up, demoting good articles while letting bad ones through, or it will be too aggressive and demote everything, or not aggressive enough and thus totally useless. But that's fine. If you want to solve this problem, you have to start somewhere: you do something, you screw up, you do something else, you screw up a little less, and so on. Facebook has all the data and the expertise to get this right in the long term, and I don't expect them to get it right from day one either. The real fear that this journalist, and many others, have is of being the patient zero, and in all honesty, I am with him: being the patient zero sucks, but someone has to be.
The term patient zero comes from the medical field: it refers to the first person to contract a particular pathology or be infected by some bacterium or virus, and who, in all likelihood, will not survive it. On the other hand, the patient's demise will help save others, as the case will be studied and analysed to advance medicine.
Now, I don't think anybody is going to die over this: journalists will struggle, their articles will be demoted, they will not understand why engagement drops even on very good articles, etc., but in the long term things will get better for them too, as proper articles end up promoted and clickbait links demoted… until the next time someone finds a brand new way to hijack Facebook for their own interest.
You cannot really expect to get everything right on the first shot, but you have to start. Whoever is there when this starts will be impacted the most, for better and for worse. It sucks, but that's the way it is.
On the other hand, I believe the author makes a good point in the paragraph immediately above the one quoted; verbatim: «If somebody wants to block certain types of headlines — because they are manipulative, or for any other reason — they should have access to filters to personalize their news feed. Instead, Facebook presents users with a black box for fear of having its algorithm reverse-engineered. That's worse than not messing with the natural flow of posts at all». Keeping people in control, and enabling them to review what was dropped from the Newsfeed, is probably the key point, and I honestly believe this would be useful to Facebook as well, without the need to show the code to anybody: in this case, as in many others in machine learning, it's the data that matter, not the code. If I have the power to say "dear Facebook, you got this wrong here" and let some content back into the Newsfeed, Facebook could leverage my judgment to make the system better. Notice, though, that such a mechanism could be used to hijack the classifiers/filters themselves, and then it would be pretty dangerous. It is also possible that Facebook will deliberately let through some content classified as bad in order to keep training the system, improving the classifiers by analysing how users react to it and creating a continuous feedback loop.
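The feedback loop described above can be sketched as follows. This is a minimal toy, not any real Facebook API: the function names, labels, and the per-user cap (a crude guard against the hijacking risk just mentioned) are all assumptions made up for illustration:

```python
from collections import defaultdict

# Hypothetical anti-abuse cap: ignore users flooding corrections to
# poison the training data. The value 5 is arbitrary.
MAX_CORRECTIONS_PER_USER = 5

training_examples = []                  # (post_text, label) pairs for retraining
corrections_by_user = defaultdict(int)  # how many corrections each user sent

def report_misclassification(user_id: str, post_text: str,
                             true_label: str) -> bool:
    """Record a user's 'dear Facebook, you got this wrong' signal.

    Accepted corrections become labelled examples for the next
    training round; rejected ones are silently dropped.
    """
    if corrections_by_user[user_id] >= MAX_CORRECTIONS_PER_USER:
        return False
    corrections_by_user[user_id] += 1
    training_examples.append((post_text, true_label))
    return True
```

Even a crude cap like this illustrates the trade-off: the corrections are valuable training data, but only if the system distrusts any single user enough that coordinated flagging cannot steer the classifier.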
To draw a conclusion, I think that Facebook trying to prevent the propagation of some content is generally a good thing, as long as we, as users, have some control over at least part of it. At the same time, we should accept that the change could be pretty bad at the beginning, but we need to start somewhere, otherwise we will get nowhere.