This is a partial list of literature relevant to ht://Dig development, including sources on other web search engines, search algorithms, databases, fuzzy searching, and other topics. It is by no means a complete bibliography of these topics, but it should include some good resources.

  1. Agirre, E., K. Gojenola, et al. (1998). ``Towards a single proposal in spelling correction.''
  2. Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text
  3. Brin, S., J. Davis, et al. (1995). Copy Detection Mechanisms for Digital Documents. ACM SIGMOD Annual Conference, San Francisco, California.
  4. Brin, S. and L. Page (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. The Seventh Annual International WWW Conference.
  5. Sergey Brin, Rajeev Motwani, Lawrence Page, Terry Winograd. What can you do with a Web in your pocket?
  6. Chakrabarti, S., B. Dom, et al. (1998). Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text. Seventh International WWW Conference, Brisbane, Australia.
  7. Cho, J., H. Garcia-Molina, et al. (1998). Efficient Crawling Through URL Ordering. The Seventh Annual International World Wide Web Conference.
  8. Engineering, U. I. (1998). Why On-Site Searching Stinks.
  9. Fang, M., N. Shivakumar, et al. (1998). Computing Iceberg Queries Efficiently. 1998 International Conference on Very Large Databases (VLDB '98), New York.
  10. Golding, A. R. and Y. Schabes (1996). Combining Trigram-based and Feature-based Methods for Context-Sensitive Spelling Correction. The 34th Annual Meeting of the Association for Computational Linguistics, Santa Cruz, CA.
  11. Monika R. Henzinger, Allan Heydon, Michael Mitzenmacher, Marc Najork. Measuring Index Quality Using Random Walks on the Web.
  12. Kleinberg, J. (1998). Authoritative sources in a hyperlinked environment. The Ninth Annual ACM-SIAM Symposium on Discrete Algorithms.
  13. Kukich, K. (1992). ``Techniques for automatically correcting words in text.'' ACM Computing Surveys 24(4): 377-439.
  14. Lawrence, S. and C. L. Giles (1998). Searching the World Wide Web. Science. 280: 98-100.
  15. Manber, U. (1997). ``A Text Compression Scheme that Allows Fast Searching Directly in the Compressed File.'' ACM Transactions on Information Systems 15(2).
  16. Marchiori, M. (1997). The quest for correct information on the web: Hyper search engines. The Sixth International WWW Conference, Santa Clara, USA.
  17. Mayfield, J. (1998). Research on N-Grams in Information Retrieval.
  18. Members of the Clever Project. Hypersearching the Web
  19. Muth, R. and U. Manber (1996). Approximate Multiple String Search. Seventh Annual Combinatorial Pattern Matching Symposium, Laguna Beach, CA.
  20. Page, L., S. Brin, et al. (1998). ``The PageRank Citation Ranking: Bringing Order to the Web.'' (work in progress).
  21. Pinkerton, B. (1994). Finding What People Want: Experiences with the WebCrawler. The Second International WWW Conference, Chicago, USA.
  22. Pollock, J. J. and E. M. Zamora (1984). ``Automatic spelling correction in scientific and scholarly text.'' Communications of the ACM 27(4): 358-368.
  23. Rapp, R. (1997). Text-Detector. c't: 386.
  24. Shivakumar, N. and H. Garcia-Molina (1995). SCAM: A Copy Detection Mechanism for Digital Documents. 2nd International Conference in Theory and Practice of Digital Libraries, Austin, Texas.
  25. Shivakumar, N. and H. Garcia-Molina (1996). Building a scalable and accurate copy detection mechanism. First ACM Conference on Digital Libraries, Bethesda, Maryland.
  26. Shivakumar, N. and H. Garcia-Molina (1998). Finding near-replicas of documents on the web. Workshop on Web Databases.
  27. Tillman, H. N. (1997). Evaluating quality on the net. Internet Librarian, Monterey, California.
  28. Wu, S. and U. Manber (1992). ``Fast Text Searching Allowing Errors.'' Communications of the ACM 35(10): 83-91.
  29. Wu, S. and U. Manber (1994). ``A Fast Algorithm for Multi-Pattern Searching.''


Last modified: $Date: 2001/06/13 14:31:49 $