mirror of
https://github.com/classilla/tenfourfox.git
synced 2024-06-10 18:29:43 +00:00
84 lines
2.8 KiB
ReStructuredText
.. _healthreport_identifiers:

===========
Identifiers
===========

Firefox Health Report records some identifiers to keep track of clients
and uploaded documents.

Identifier Types
================

Document/Upload IDs
-------------------

A random UUID called the *Document ID* or *Upload ID* is generated when the FHR
client creates or uploads a new document.

When clients generate a new *Document ID*, they persist this ID to disk
**before** the upload attempt.

As part of the upload, the client sends all old *Document IDs* to the server
and asks the server to delete them. For well-behaved clients, the server
thus holds a single record per client, with a randomly changing *Document ID*.

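The upload cycle described above can be sketched as follows. This is a minimal
illustration, not the FHR client's actual code: the JSON state file, the
function names, and the ``send`` transport callable are all assumptions made
for the example.

```python
import json
import uuid
from pathlib import Path


def load_ids(state_path):
    """Return the Document IDs recorded on disk, oldest first."""
    path = Path(state_path)
    if not path.exists():
        return []
    return json.loads(path.read_text())


def save_ids(state_path, ids):
    """Persist the list of known Document IDs to disk."""
    Path(state_path).write_text(json.dumps(ids))


def upload_document(state_path, payload, send):
    """Upload payload under a fresh Document ID and retire the old ones.

    send(document_id, payload, delete_ids) is a hypothetical transport
    callable that returns True when the server stored the new document
    and deleted the listed old ones.
    """
    old_ids = load_ids(state_path)
    new_id = str(uuid.uuid4())  # random UUID

    # Persist the new ID *before* the upload attempt, so it is still known
    # (and can be deleted later) if the client dies mid-upload.
    save_ids(state_path, old_ids + [new_id])

    if send(new_id, payload, old_ids):
        # The server now holds only the new document for this client.
        save_ids(state_path, [new_id])
        return new_id

    # Upload failed: keep every ID so a later upload can request deletion.
    return None
```

In this sketch a failed upload leaves the new ID on disk alongside the old
ones, so the next successful upload asks the server to delete all of them.
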
Client IDs
----------

A *Client ID* is an identifier that **attempts** to uniquely identify an
individual FHR client. Please note the emphasis on *attempts* in that last
sentence: *Client IDs* do not guarantee uniqueness.

The *Client ID* is generated when the client first runs or as needed.

The *Client ID* is transferred to the server as part of every upload. The
server is thus able to associate multiple document uploads with a single
*Client ID*.

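As a rough sketch of the behavior above, the following generates a *Client ID*
on first use and shows how a server could group uploads by it. The state file
layout, the ``clientID``, ``clientIDVersion`` and ``documentID`` field names,
and both function names are hypothetical; the random UUID corresponds to
version 1 of the *Client ID* as described in this document.

```python
import json
import uuid
from collections import defaultdict
from pathlib import Path


def get_client_id(state_path):
    """Return the stored Client ID, generating one if none exists yet."""
    path = Path(state_path)
    if path.exists():
        state = json.loads(path.read_text())
        if state.get("clientID"):
            return state["clientID"]

    # First run (or missing ID): create a random UUID and persist it.
    state = {"clientID": str(uuid.uuid4()), "clientIDVersion": 1}
    path.write_text(json.dumps(state))
    return state["clientID"]


def group_by_client(uploads):
    """Server-side view: map each Client ID to its uploaded Document IDs."""
    groups = defaultdict(list)
    for upload in uploads:
        groups[upload["clientID"]].append(upload["documentID"])
    return dict(groups)
```

Because the ID lives in the client's stored state, copying that state to
another profile or machine copies the ID as well, which is one reason
uniqueness is only attempted and not guaranteed.
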
Client ID Versions
^^^^^^^^^^^^^^^^^^

The semantics for how a *Client ID* is generated are versioned.

Version 1
   The *Client ID* is a randomly-generated UUID.

History of Identifiers
======================

In the beginning, there were just *Document IDs*. The thinking was that
clients would clean up after themselves and leave at most one active document
on the server.

Unfortunately, this did not work out. Brute-force analysis to deduplicate
records on the server revealed a number of interesting patterns.

Orphaning
   Clients would upload a new payload without deleting the old payload.

Divergent records
   Records would share data up to a certain date and then the data would
   almost completely diverge. This appears to be indicative of profile
   copying.

Rollback
   Records would share data up to a certain date. Each record in this set
   would contain data for a day or two but no extra data. This could be
   explained by filesystem rollback on the client.

A significant percentage of the records on the server belonged to
misbehaving clients. Identifying these records was extremely
resource-intensive and error-prone. These records were undermining the
ability to use Firefox Health Report data.

Thus, the *Client ID* was born. The intent of the *Client ID* was to
uniquely identify clients so the extreme effort required and the
questionable reliability of deduplicating server data would become
problems of the past.

The *Client ID* was originally a randomly-generated UUID (version 1). This
allowed detection of orphaning and rollback. However, these version 1
*Client IDs* were still susceptible to use on multiple profiles and
machines if the profile was copied.