Published on: Wed, 14 Aug 2019 +0800
User counting hasn’t been in the spotlight for a while, but it has always remained something we’ve been thinking about in the background. One issue that’s been brought up a few times is that people see the use of a unique ID to distinguish between unique users as a form of tracking.
While the unique ID on its own is not enough to do any meaningful tracking, I can still understand that it gets people concerned. In a world where privacy breaches have become somewhat of a norm, any form of identifier is scary.
And so, when it was brought to my attention a few months back that there might be a way to handle user counting without it, I was immediately interested in implementing it. In addition, since user counting is going to be needed in our upcoming Android version, it felt like a good time to reimplement user counting altogether in a more cross-platform fashion and start working towards removing the unique ID from our requests.
However, changing a user-counting solution should not be taken lightly. I’ll get into a detailed explanation below, but if you want the tl;dr version, this is what you can expect:
- Starting with our upcoming version Vivaldi 2.7, an additional request to our user counting endpoint will be made. This request is similar to the current one and includes the unique ID, but contains additional parameters that will be used by the new unique ID-free implementation.
- A few versions later, the old user counting request will be removed.
- Even later, the unique ID will be eliminated from the new request altogether. We will keep generating it locally to aid with counting on computers with several Vivaldi installations, but it will only be used locally.
Note that the code used to generate the new request is written entirely in C++ and will be published with our source releases, allowing you to check that the code does what we claim it does.
So, why are we doing this in a slow and convoluted way? It’s because…
User counting has to remain accurate
We want to make sure that the new code I am introducing is working as intended and reporting the same numbers as the old code. So, the first step is to make the new implementation do exactly the same as the old one so that the server can keep counting by unique ID, as it used to. We can then check that the new user counting code reports the same number as the old one before decommissioning the old one.
Afterwards, my colleague Claudia will handle the task of setting up the servers to be able to count without a unique ID. We may need some time to tweak the code both in Vivaldi and on the server in order to get the same result with and without a unique ID. Finally, once everything works, we’ll drop the unique ID from the requests altogether. Hopefully, not too much tweaking will be required, but we do need to ensure everything is fine since…
User counting needs to meet some requirements
The basic idea is simple enough. If we set Vivaldi to send a request to the server once per day and count the number of requests over the course of 24 hours, we will know how many users were using Vivaldi during that day. That sort of information is nice if we want to see the direct effect of an event on the number of users, but often it’s more useful to know how many people were using Vivaldi in a given week or a given month. That sort of number smooths out the drop in daily usage that happens during weekends or holidays. The idea for getting weekly or monthly numbers is the same: send a somewhat different request every week or month and count the number of those requests over the same time period.
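To make the daily, weekly and monthly idea concrete, here is a minimal sketch of how a client could decide which requests are due. It is not the code Vivaldi ships. All names (`DueRequests`, `ComputeDueRequests`) are hypothetical, days are plain integers counted from an arbitrary day 0, and a "month" is simplified to a 30-day bucket.

```cpp
#include <cassert>
#include <optional>

// Hypothetical illustration only; not Vivaldi's actual implementation.
struct DueRequests {
  bool daily = false;
  bool weekly = false;
  bool monthly = false;
};

// Assumption: days are non-negative integers counted from an arbitrary day 0.
// Weeks are 7-day buckets; months are simplified to 30-day buckets.
constexpr int kDaysPerWeek = 7;
constexpr int kDaysPerMonth = 30;

// Returns which counting requests should be sent on `today`, given the day of
// the previous report (no value if this installation has never reported).
DueRequests ComputeDueRequests(std::optional<int> last_report_day, int today) {
  DueRequests due;
  if (!last_report_day.has_value()) {
    // Never reported before: every kind of request is due.
    due.daily = true;
    due.weekly = true;
    due.monthly = true;
    return due;
  }
  const int last = *last_report_day;
  assert(last >= 0 && last <= today);
  // A request is due once we have moved into a new day, week or month bucket.
  due.daily = today > last;
  due.weekly = today / kDaysPerWeek > last / kDaysPerWeek;
  due.monthly = today / kDaysPerMonth > last / kDaysPerMonth;
  return due;
}
```

With this scheme, the server only has to count how many requests of each kind it receives during the matching period. No identifier is needed to avoid counting the same installation twice.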
There are a few more bits of information that the unique ID gives us that we want to replicate. First, we want to know when we are getting a new user. We can easily detect locally whether Vivaldi is being run for the first time, so we send that information with the request.
Next, it is useful to know how long someone has been using Vivaldi. We want to make a browser that people want to use in the long run, so checking that there are enough people out there that stick with us is quite important. So, we add the installation week to each request.
Lastly, we send the number of days during which Vivaldi was prevented from reporting (for any reason). This should allow us to have a better picture of returning users.
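As a rough illustration, the three extra pieces of information described above (first run, installation week, days without reporting) could be packed into a request like this. The parameter names `first_run`, `installation_week` and `delay_days` are made up for this sketch and are not the real request format. `MissedDays` is one possible reading of "days during which Vivaldi was prevented from reporting", assuming days are plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical illustration only; the field and parameter names are invented.
struct CountingReport {
  bool first_run = false;     // true only on the very first run
  int installation_week = 0;  // week in which Vivaldi was installed
  int delay_days = 0;         // days on which no report could be sent
};

// One possible interpretation: the full days strictly between the previous
// report and today. Reporting on consecutive days yields 0.
int MissedDays(int last_report_day, int today) {
  assert(last_report_day <= today);
  return std::max(0, today - last_report_day - 1);
}

// Serializes the report as a URL query string. Note that it contains no
// per-user identifier.
std::string BuildQuery(const CountingReport& report) {
  std::string query = "first_run=";
  query += report.first_run ? "1" : "0";
  query += "&installation_week=" + std::to_string(report.installation_week);
  query += "&delay_days=" + std::to_string(report.delay_days);
  return query;
}
```

Because the installation week is shared by everyone who installed Vivaldi during that week, it is far coarser than a unique ID, yet it still lets the server group requests by how long people have been using the browser.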
In addition to all this, we still get the CPU architecture and the screen resolution of the computer on which Vivaldi is running as well as the user agent. This information isn’t related to user counting, but it lets us know what sort of machines we are designing Vivaldi for.
All this is in line with what other privacy-focused companies are doing when it comes to counting users. It would be easy enough if that were all we needed to know, but the unique ID also lets us easily handle…
Counting with multiple Vivaldi installations on the same computer
Vivaldi offers a few options to have multiple instances of the browser running on the same system and ways to take a full Vivaldi installation from one machine to another. The Standalone option in the Windows installer will allow this, for instance. These abilities complicate user counting because they allow for two situations.
The first case occurs when someone installs Vivaldi multiple times on their computer, using separate profiles, mostly for testing. In such a case we want to count this as one user and the different installations must be able to know that they are by the same user, so that they can coordinate to only send one daily request in total instead of one daily request per installation.
The other situation happens when several people use their own standalone Vivaldi installation (using their own portable drive) on the same computer, on the same operating system account. In such cases, each installation has to know that it is used by a different person and report on its own, without interfering with other installations on the system.
To be able to distinguish between those cases, our solution so far has been to keep one copy of the unique ID as part of the LocalState file and one copy within the OS user profile. If one of the copies is missing, it is set up again using the other available copy. If both copies are present but do not match, we can assume that we are running a standalone installation that was moved to another system (second scenario). In all other cases, we assume the first scenario.
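The decision described above can be sketched as a small function. This is an illustration, not the actual Vivaldi code: the names `InstallKind`, `IdDecision` and `ReconcileIds` are hypothetical. The handling of the case where both copies are missing (generate a fresh ID and store it in both places) is an assumption that the text does not spell out.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical illustration only; not Vivaldi's actual implementation.
enum class InstallKind {
  kCooperating,  // first scenario: same user, installations share one count
  kIndependent,  // second scenario: standalone install moved from elsewhere
};

struct IdDecision {
  InstallKind kind;
  std::string local_state_id;  // copy kept in the LocalState file
  std::string os_profile_id;   // copy kept in the OS user profile
};

// `fresh_id` is only used when neither copy exists (assumed behaviour).
IdDecision ReconcileIds(const std::optional<std::string>& local_state_id,
                        const std::optional<std::string>& os_profile_id,
                        const std::string& fresh_id) {
  assert(!fresh_id.empty());
  if (local_state_id.has_value() && os_profile_id.has_value()) {
    // Both copies present: a mismatch means a standalone installation that
    // was moved to another system, so it must report on its own.
    const InstallKind kind = (*local_state_id == *os_profile_id)
                                 ? InstallKind::kCooperating
                                 : InstallKind::kIndependent;
    return IdDecision{kind, *local_state_id, *os_profile_id};
  }
  // At most one copy present: restore the missing copy from the other one,
  // or fall back to a fresh ID if neither exists.
  std::string id = fresh_id;
  if (local_state_id.has_value()) {
    id = *local_state_id;
  } else if (os_profile_id.has_value()) {
    id = *os_profile_id;
  }
  return IdDecision{InstallKind::kCooperating, id, id};
}
```

Note that the ID is only ever compared locally here. Nothing in this decision requires sending it to a server.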
Because I have not found a better way to distinguish between those two situations, we will keep generating and storing unique IDs after we have stopped sending them. They will only be used to know whether Vivaldi has to cooperate with other installations on the system to make sure to only be counted once.
All this seems like a lot of minutiae just for user counting, but the reality is that…
Accurate user count is important
Having more users gives us more ability to do the things that make Vivaldi a great browser. This mostly comes down to the partnerships that Christian mentioned in a recent blog post which I recommend reading.
To build such partnerships and be able to get good deals with our partners, it helps to be able to show that we have many users. For the people we partner with, having more users simply means that they can reach more people, and this makes a deal with us more interesting.
In addition to the partnerships we use for generating revenue, this also applies to more technical partnerships. When we implement a feature that supports specific OS functionality, or specific hardware, like the Razer Chroma support we implemented a couple of versions ago, we are more likely to get good technical support from the company we are talking to if they know our support of their feature will reach more people.
However, regardless of what our actual user count is, it can only be taken seriously if we can show we have done everything we could to ensure its accuracy. High numbers mean nothing if they are not backed up by a solid counting solution.
This is why user counting is something we have implemented very carefully. It must balance the need for accurate numbers with the requirement of keeping only the strict minimum amount of information about our users.
I hope that the change from counting based on a unique ID to counting based on simply counting requests will make you even more confident that we are doing everything possible to avoid any form of tracking.
If you have any questions or need any clarification about this surprisingly complex topic, let me know in the comments and I’ll do my best to answer.
Main photo by Crissy Jarvis on Unsplash.