Drawbridge Cross-Device Connection Challenge
Abstract:
In this contest, participants are tasked with identifying user connections across different devices without using common user handle information such as name, email address, or phone number. The goal is to show that a technological, probabilistic approach to cross-device identity is a viable alternative to relying on deterministic information. Participants will be given various behavioral observations associated with anonymous IDs that have been synthesized from different devices and digital environments. They will then be asked to estimate the likelihood that a set of different IDs from different domains belong to the same user, and the performance level of those estimates will be measured.
General description of the problem, competition task, and evaluation metrics:
It is very common for one user to have multiple unique identifiers across different domains, including personal computing devices and mobile devices. Although these device identifiers appear disconnected, they often exhibit similar behaviors because they belong to the same user. A common method for bridging device identifiers relies on deterministic identifiers such as names, email addresses, phone numbers, or other personal information. In this contest, we are interested in whether it is technically possible to infer which device identifiers belong to the same user using a probabilistic approach, and in how accurate those predictions are.
For this purpose, we have synthesized a set of desktop browser cookies, C, and a set of mobile device identifiers, D, as two specific examples of users' digital identities on desktop and mobile devices. We ask participants to use machine learning and data mining methods to identify when a mobile device identifier Di and a desktop browser cookie Cj belong to the same user, without using common deterministic user information. Specifically, for each mobile device identifier Di in the set D, participants should provide a ranked list of desktop cookies Ri,j from the set C, ordered by the likelihood that each desktop cookie belongs to the same user as device Di. Participants may use any methods they like, as long as they provide the ranked list. Each submission will be converted to a precision-recall (P-R) curve based on the ground truth and scored by the area under that curve (AUC).
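As an illustration only, the evaluation described above might be sketched as follows. The text does not specify how ranked lists are turned into a P-R curve, so this sketch assumes that every candidate cookie carries a likelihood score and that a single threshold is swept over all (device, cookie) pairs. The names pr_curve and pr_auc, the toy IDs, and the trapezoidal integration are hypothetical choices, not the contest's official scoring code.

```python
from typing import Dict, List, Set, Tuple

Ranked = Dict[str, List[Tuple[str, float]]]  # device ID -> [(cookie ID, score), ...]
Pair = Tuple[str, str]                       # (device ID, cookie ID)


def pr_curve(scores: Ranked, truth: Set[Pair]) -> List[Tuple[float, float]]:
    """Return (recall, precision) points from sweeping one global score threshold.

    Assumption: all candidate pairs are pooled and ordered by descending score.
    """
    if not truth:
        raise ValueError("ground truth must contain at least one matching pair")
    pooled = sorted(
        ((s, (d, c) in truth) for d, ranked in scores.items() for c, s in ranked),
        key=lambda item: item[0],
        reverse=True,
    )
    points = []
    true_positives = 0
    for k, (_, is_match) in enumerate(pooled, start=1):
        true_positives += is_match  # bool counts as 0 or 1
        points.append((true_positives / len(truth), true_positives / k))
    return points


def pr_auc(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under the (recall, precision) points."""
    if not points:
        return 0.0
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy data: two devices, each with a ranked list of candidate cookies.
scores = {
    "D1": [("C1", 0.9), ("C2", 0.2)],
    "D2": [("C3", 0.8), ("C1", 0.4)],
}
truth = {("D1", "C1"), ("D2", "C3")}

print(pr_auc(pr_curve(scores, truth)))
```

In this toy case the two true pairs receive the two highest scores, so the ranking is perfect. A ranking that places incorrect cookies above correct ones lowers precision at low recall and therefore lowers the area.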
Workshop:
There will be a workshop dedicated to the challenge, as part of the ICDM conference program.
Contest Website:
https://www.kaggle.com/c/icdm-2015-drawbridge-cross-device-connections
Important dates:
Contest starts: June 1st
Contest ends: August 24th
Workshop paper initial submission due: September 7th
Workshop paper feedback sent to authors: September 14th
Chandan K. Reddy, Wayne State University, USA
Takashi Washio, Osaka University, Japan