Discovering Routines from Large-Scale Human Locations using Probabilistic Topic Models

Katayoun Farrahi, Daniel Gatica-Perez. DOI: 10.1145/1889681.1889684

Citations: 2
In this work we discover the daily location-driven routines contained in a massive real-life human dataset collected by mobile phones. Our goal is the discovery and analysis of human routines which characterize both individual and group behaviors in terms of location patterns. To achieve these goals, we develop an unsupervised methodology based on two different probabilistic topic models and apply them to the daily life of 97 mobile phone users over a 16-month period. Topic models are probabilistic generative models for documents that identify the latent structure that underlies a set of words. Routines dominating the entire group's activities, identified with a methodology based on the Latent Dirichlet Allocation topic model, include "going to work late", "going home early", "working non-stop" and "having no reception (phone off)" at different times over varying time intervals. We also detect routines which are characteristic of users, with a methodology based on the Author-Topic model. With the routines discovered, and the two methods of characterizing days and users, we can then perform various tasks. We use the routines discovered to determine behavioral patterns of users and groups of users. For example, we can find individuals that display specific daily routines, such as "going to work early" or "turning off the mobile (or having no reception) in the evenings". We are also able to characterize daily patterns by determining the topic structure of days, in addition to determining whether certain routines occur dominantly on weekends or weekdays. Furthermore, the routines discovered can be used to rank users or find subgroups of users who display certain routines. We can also characterize users based on their entropy. We compare our method to one based on clustering using K-means. Finally, we analyze an individual's routines over time to determine regions with high variations, which may correspond to specific events.
Published in 2011.
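
The pipeline the abstract describes (days treated as documents, location observations treated as words, and topics read as routines) can be illustrated with a small sketch. This is not the authors' implementation. The word encoding in `day_to_words`, the H/W/O location labels, the toy data, the hyperparameters, and the use of collapsed Gibbs sampling for Latent Dirichlet Allocation are all assumptions made for illustration. The `entropy` helper assumes that a user or day is characterized by the entropy of its topic distribution, which the abstract does not spell out. The Author-Topic model is not covered.

```python
import math
import random


def day_to_words(slots, width=3, blocks=8):
    """Turn one day of half-hour location labels into 'location words'.

    Each word joins `width` consecutive labels with a coarse time-of-day
    block index. This encoding is an illustrative assumption.
    """
    per_block = len(slots) // blocks
    return [
        "".join(slots[i:i + width]) + str(i // per_block)
        for i in range(len(slots) - width + 1)
    ]


def fit_lda(docs, n_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return per-document topic mixes."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v, z = index[w], assign[d][n]
                # Remove the current assignment before resampling it.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed topic proportions per day; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


# Toy data (hypothetical): H = home, W = work, O = other; 48 slots per day.
workday = list("H" * 16 + "W" * 20 + "H" * 12)
weekend = list("H" * 22 + "O" * 8 + "H" * 18)
docs = [day_to_words(workday)] * 5 + [day_to_words(weekend)] * 2

theta, topic_word, vocab = fit_lda(docs, n_topics=2)
for day, mix in enumerate(theta):
    print(day, [round(x, 2) for x in mix], round(entropy(mix), 2))
```

In this sketch, a day whose topic mix is concentrated on one topic has low entropy, while a day that blends several routines has higher entropy. The same measure could be applied to a per-user topic distribution under the same assumption.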
    • ...In contrast with other datasets to which LDA has been successfully applied (for example, the GSM-localization part of the Reality Mining dataset used in [7]), Google Latitude records more complex and finer-grain data...
    • ...API (http://code.google.com/apis/latitude). For each user, the dataset consists of a set of records storing the time-stamp, longitude and latitude and the accuracy of the measurement (see Fig. 1). Since we are interested in extracting routine behaviors of individual users we focused on long-term fine-grain data acquisition from few users rather than on coarse data from a large user population (like e.g., [7], [6], [13])...
    • ...In comparison with previous works [7], [6], our Google...
    • ...Daily representation. In the second step, following the approach proposed in [7], [6], we organized the dataset into a sequence of days each consisting of 48 time-slots lasting 30 minutes each...
    • ...The estimation of the optimal number of topics is an active research challenge and some mechanisms have been proposed to guide this choice [3], [7]...
    • ...Other clustering mechanisms are not able to identify that cluster since they consider whole days only [7]...
    • ...In [7] authors propose the use of probabilistic topic models to capture human routines from cell tower connections...
    • ...In comparison with [7] our work uses a more complex dataset, thus allowing to analyze the topic...
    • ...As above mentioned, the geographic coordinates provided by Google Latitude allows to enrich the location vocabulary with a higher number of places (in contrast with the ‘home’, ‘work’ and ‘elsewhere’ label used in [7])...

    Laura Ferrari et al. Discovering daily routines from Google Latitude with topic models

    • ...Farrahi and Gatica-Perez [11] presents an interesting work in the mobile domain using also the LDA model to discover routines...

    Federico Castanedo et al. Modeling and Discovering Occupancy Patterns in Sensor Networks Using L...
