
WISDM: Web Indexing and Search for Dynamic Mining
Its immense scale and wide reach have made the Web the ultimate
information repository: it is not only the source where we find
information but also the destination where we publish it. These dual
forces have enriched the Web with all kinds of data, well beyond the
conventional view of the Web as a corpus of HTML pages, or
"documents". Consequently, the Web is now a collection of data-rich
pages, on the "surface Web" of static URLs (e.g., personal or
company homepages) as well as the "deep Web" of database-backed
content (e.g., flights from aa.com). This richness of data, while a
promising opportunity, makes it challenging to find the data we
need effectively. We propose to build novel search systems to facilitate users' quest for data on the Web.
Projects
- Entity Search: to propose and build a novel Web search engine that goes beyond document retrieval and searches over entities, such as emails, phone numbers, and addresses.
- Entity Extraction: to study general, scalable, and robust extraction frameworks that support various types of entity extraction tasks.
- Relational Mining on the Web: to mine relations among entities, the next step beyond entity search results.
- Object Search: to propose and build a search system that searches over Web objects represented in documents.
People
Alumni
- Joseph M. Kelley
- William Davis
Collaborators
Publications
- EntityRank: Searching Entities Directly and Holistically.
T. Cheng, X. Yan, and K. C.-C. Chang. In the Proceedings of the 33rd International
Conference on Very Large Data Bases (VLDB 2007), Vienna, Sep 2007. [PDF]
[PPT]
- Supporting Entity Search: Towards Agile Best-Effort
Information Integration over the Web. T. Cheng, X. Yan, and K. C.-C. Chang. In the Proceedings of the 2007 ACM SIGMOD Conference (SIGMOD 2007)
(Demo Paper), Beijing, June 2007. [PDF]
- Entity Search Engine: Towards Large Scale Information
Integration on the Web. T. Cheng and K. C.-C. Chang. In the Proceedings
of the 3rd Conference on Innovative Data Systems Research (CIDR 2007)
(Extended Demo Paper), Asilomar, Jan 2007. [PDF]
[PPT]
Technical Reports
- Weaving Entities into Relations: From Page
Retrieval to Relation Mining on the Web. J. M. Kelley, K. C.-C.
Chang, T. Cheng, S. Chuang, W. Davis. UIUCDCS-R-2006-2752, Department of Computer Science, UIUC, Nov 2004. [PDF]
Talks
- Entity Search: Finding Stuff on the Web, Directly and Holistically.
K. C.-C. Chang. A talk given at the Stanford InfoSeminar 2007, Stanford, Feb 9, 2007. [PPT]
- Finding stuff on the Web. K. C.-C. Chang.
A talk given at the Berkeley Database Group Seminar, Berkeley,
2006. [PPT]
Online Demos
Datasets