Sciweavers

808 search results - page 145 / 162
» Keyword-based document clustering
WSDM
2010
ACM
16 years 1 month ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
SIGIR
2009
ACM
16 years 28 days ago
Compressing term positions in web indexes
Large search engines process thousands of queries per second on billions of pages, making query processing a major factor in their operating costs. This has led to a lot of resear...
Hao Yan, Shuai Ding, Torsten Suel
ICSE
2009
IEEE-ACM
15 years 11 months ago
A-SCORE: Automatic software component recommendation using coding context
Reusing software components (e.g. classes or modules) improves software quality and developer’s productivity. Unfortunately, developers may miss many reusing opportunities since...
Ryuji Shimada, Yasuhiro Hayase, Makoto Ichii, Mako...
COMAD
2009
15 years 7 months ago
Querying for relations from the semi-structured Web
We present a class of web queries whose result is a multi-column relation instead of a collection of unstructured documents as in standard web search. The user specifies the query...
Sunita Sarawagi
CIKM
2007
Springer
16 years 18 days ago
Regularized locality preserving indexing via spectral regression
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han