Overall Research Goal
My overall research goal is to develop algorithms and systems that enable efficient and trustworthy information sharing and knowledge discovery over dynamic, heterogeneous, and massive-scale networked information systems such as the Web, social media, and mobile information systems. These systems derive much of their value from their relative openness (in comparison to tightly controlled systems), which can lead to a host of great opportunities -- including explosive growth, self-organization, bottom-up discovery of "citizen experts", and serendipitous discovery of new resources beyond the scope and intent of the original system designers. However, this same openness and self-supervision can also lead to vulnerabilities in the principles, design, and maintenance of these systems.
Hence, my research focus has both a positive and a negative dimension. On one hand, I focus on threats to these systems and design methods to mitigate negative behaviors; on the other, I look for positive opportunities to mine and analyze these systems in order to develop next-generation algorithms and architectures that can empower decision makers. For both, my research approach is primarily experimental: I develop algorithms, models, and systems, and then evaluate their effectiveness over real data and in live settings.
Research Thrust 1: Information Quality in Open Systems
My first major research thrust is focused on countering threats to the quality of information in open and self-managing systems like the Web and online social networks. Our efforts range up the cognitive stack, from lower-level automated spam to higher-level deceptive persuasion.
- Spam-Resilient Web-Scale Computing: overview, PODC 2007, TPDS 2009, ...
- Defending Socio-Computational Systems: overview, JCDL 2008, SIGIR 2010, ...
- Strategic Manipulation and Adversarial Propaganda: overview
Research Thrust 2: Web-Scale Mining and Information Management
My second major research thrust is focused on analyzing and mining large-scale information networks for designing and developing new algorithms and architectures for empowering decision makers and enabling new modes of information discovery.
- Deep Web Information Retrieval: overview, TKDE 2005, SIGIR 2006, ...
- Mining Social Media: overview, CIKM 2011, ICWSM 2011, ICWSM 2008, ...
- Real-Time Web Analytics: overview, WSDM 2011, SIGIR 2011 demo, ...
Toward a Third Research Thrust: In Situ Human Computational Systems
As part of my NSF CAREER award, I am building on my two research thrusts and extending towards a third centered on real-time search and computation. The key motivating idea is to link crowdsourcing approaches popularized for human computation (e.g., Amazon Mechanical Turk, the ESP Game) to the real-time crowds that manifest in the wild, in effect "closing the loop" for rapid decision-making by stakeholders using the system.