INF 703 Proseminar in Information Organisation

Spring, 1998
Course homepage: http://www.albany.edu/acc/courses/inf703.spring98.html
Course Newsgroup: sunya.class.inf703
Course E-Mail: inf703@cnsunix.albany.edu
Course Meeting Time: W 4:15 - 7:05 Room: Draper 146

Faculty Team

J. Peter Seagle: BA 315, jps33@cnsunix.albany.edu, /business//faculty/seagle.html
Hemalatha Iyer: Draper, hi651@cnsvax.albany.edu, /sisp/faculty.html#fac_iyer
Jagdish S. Gangolly (Co-ordinator): BA 333, gangolly@cnsunix.albany.edu, /acc/gangolly

Catalog Description:

Examination of the organization of information from the perspectives of data base systems conceptualization, structure and design; classificatory and data ordering principles that facilitate information retrieval; informetrics, including knowledge production and representation patterns, cognitive, semantic and citation/consultation factors.

A More Honest Description:

The organisation of information is examined from various related perspectives including

  • Organisation and retrieval of structured data in relational, network, hierarchical and object-oriented databases, including a discussion of topics relating to concurrency, recovery, and physical layout in client/server architectures.
  • The role of indexing languages and abstracting in the organisation and retrieval of text databases.
    • Classificatory structures and cognitive approaches to categorisation.
    • Methods for indexing, classification & thesauri construction, and their role in the retrieval of information from text databases.
    • Models for text retrieval including vector-space, probabilistic (Bayesian), and natural language based retrieval models, and their implementation in retrieval systems such as SMART, INQUERY.

    Texts:

    The basic required textbooks for the course are:

  • Readings in Information Retrieval, Karen Sparck Jones & Peter Willett
    (Morgan Kaufmann Publishers, Inc., 1997).
  • Internet 101, Wendy Lehnert (Addison Wesley Longman, 1998)
    We also recommend that you have one of the following books. If you already have the Elmasri & Navathe text, you need do nothing; if you do not, we recommend the Ullman & Widom text.

  • A First Course in Database Systems, Jeffrey Ullman & Jennifer Widom (Prentice-Hall, Inc., 1997)
  • Fundamentals of Database Systems, Ramez Elmasri & Shamkant Navathe
    (Addison Wesley Longman, 1994).
    In addition to the above, a number of readings are assigned as stated in the schedule below. Each reading in the schedule will either be placed in the Dewey Library on the downtown campus or be linked to an appropriate site on the Internet. This course page is updated continuously during the semester, so it is important that you visit it often.

    Unfortunately, both the following classic texts on Information Retrieval are out of print. We shall be placing on reserve in the Dewey library copies of

  • Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Gerard Salton (Addison Wesley, 1989).
    The other classic, by van Rijsbergen, has fortunately been placed on the Internet in .pdf format. You may like to download and refer to it.

  • Information Retrieval, C.J. van Rijsbergen (Department of Computing, University of Glasgow, 1998).
    Class Conduct:

    Each class meeting is split into two sessions, and usually two different faculty from the INF703 team will cover them. In the schedule below, the two sessions are named I and II. The classes will consist of lectures, discussion of assigned papers, and student presentations.

    Course Evaluation & Grading:

    The final course grade will depend on the following components:
    • A take-home examination on the database component of the course (13%) (Seagle)
    • A database project (12%) (Seagle)
    • A group thesaurus project (37.5%) (Iyer)
    • A retrieval project (37.5%) (Gangolly)
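
    As a worked illustration of how the four weights above combine, here is a minimal sketch. It assumes that each component is scored on a 0-100 scale and that the final grade is a simple weighted average; neither is stated above, and the names `WEIGHTS`, `final_grade`, and the example scores are hypothetical.

```python
# Component weights as listed above (percent of the final grade).
WEIGHTS = {
    "database_exam": 13.0,      # take-home examination (Seagle)
    "database_project": 12.0,   # database project (Seagle)
    "thesaurus_project": 37.5,  # group thesaurus project (Iyer)
    "retrieval_project": 37.5,  # retrieval project (Gangolly)
}


def final_grade(scores):
    """Weighted average of component scores.

    Assumption: every score is on a 0-100 scale. The syllabus does not
    specify the scale or the exact aggregation rule.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the four components")
    total_weight = sum(WEIGHTS.values())  # the listed weights add up to 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores, for illustration only.
example = {
    "database_exam": 80,
    "database_project": 90,
    "thesaurus_project": 85,
    "retrieval_project": 95,
}
print(final_grade(example))
```

    Under these assumptions the two projects led by Iyer and Gangolly together account for three quarters of the grade, so they dominate the result.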

    TENTATIVE SCHEDULE

    January 21, 1998

  • I. Introduction to the course: The INF703 team will present the course requirements, and introduce the course.
  • II. The Relational Model: Relational Algebra (Seagle)
  • Readings:
  • Codd, E.F., "Relational Database: A Practical Foundation for Productivity," Comm. ACM, 25, 2, (February 1982) 109-117.
  • Elmasri, Ramez and S. B. Navathe, Fundamentals of Database Systems, Sec. Ed. (Benjamin Cummings, 1994). Chapters 1-4 should be familiar. The first class will focus on ER diagrams (Chapter 3), as well as the relational model and the associated algebraic operations.
  • Tanenbaum, Andrew, Computer Networks, 2nd. Ed., Prentice Hall, 1988, pp. 9-30.
  • Vetter, Ronald J., "ATM Concepts, Architectures, and Protocols," Comm. ACM, 38, 2, (Feb. 1995), pp. 30-38.
    January 28, 1998

  • I. Indexing Languages: Thesaurus in retrieval; models, design and construction (Iyer)
  • Readings:
  • Rowley, Jennifer. (1992) The Subject Approach: Introduction, processes, tools and simple evaluation in Organizing Knowledge. Brookfield, Vermont: Ashgate, chap. 12, pp.159-175.
  • Alphabetical indexing languages. (1992) In Rowley, Jennifer. Organizing Knowledge. Brookfield, Vermont: Ashgate, chap. 16, pp.242-278.
  • Kingsland, L.C., Harbourt, A.M., Syed, E.J. and Schuyler, P.L. (1993) "Coach: Applying UMLS knowledge sources in an expert system searcher environment" Bulletin of the Medical Library Association, 81(3), 178-183.
  • United Nations Educational, Scientific and Cultural Organization. (1981). Guidelines for the Establishment and Development of Monolingual Thesauri. Paris: UNESCO.
  • Chamis, Alice Y. Vocabulary Control and Search Strategies in Online Searching. (Westport, Connecticut: Greenwood Press, 1991)
  • Lancaster, F.W. Vocabulary Control for Information Retrieval (Arlington, Va.: Information Resources Press, 1986)
  • II. Relational Model (Continued) (Seagle)
    February 4, 1998

  • I. Indexing Languages: Thesaurus in retrieval; models, design and construction (Continued) (Iyer)
  • II. Implementation Issues: (Concurrency, Recovery, Client/Server Architectures) (Seagle)
  • Readings:
  • Elmasri & Navathe, Chapters 17, 18, and 19 (pp. 577-588)
  • Vaughn, Larry, Client/Server System Design and Implementation, (McGraw-Hill, 1994), pp.3-34.
    February 11, 1998

  • I. Indexing Systems (Iyer)
  • Readings:
  • Farrow, J.F. (1991). "A cognitive process model of document indexing." Journal of Documentation June; 47(2):149-66.
  • Fidel, R. (1991) "Searchers' selection of search keys: controlled vocabulary or free-text searching." Journal of the American Society for Information Science August; 42(7):501-14.
  • Frohmann, Bernd. (1990). "Rules of Indexing: A Critique of Mentalism in Information Retrieval Theory." Journal of Documentation 46(2):81-101.
  • Indexing Systems. (1992) In Rowley, Jennifer. Organizing Knowledge. Brookfield, Vermont: Ashgate, chap. 17, pp.279-303.
  • Fugmann, Robert. (1994). Representational Predictability: Key to the Resolution of Several Pending Issues in Indexing and Information Supply. In Albrechtsen, Hanne and Oernager, Susanne. (Eds.) Knowledge Organization and Quality Management: Proceedings of the Third International ISKO Conference 20-24 June 1994, Copenhagen, Denmark. Frankfurt, Germany: Indeks Verlag. pp. 414-422.
  • II. Data Warehousing & Mining (Seagle)
  • Readings:
  • To be announced
    February 18, 1998

  • I. Other Database Models (Hierarchical, Network, and Object-Oriented) (Seagle)
  • Readings:
  • To be announced
  • II. Indexing Systems (Continued) (Iyer)
    February 25, 1998

  • I. Automatic Indexing & Classification (Gangolly)
  • Readings:
  • van Rijsbergen, Chapters 1, 2, and 3.
  • Articles by Joyce & Needham (pp.15 - 20), Doyle (pp.25 - 38), Salton & Lesk (pp.60 - 84), Salton & Buckley (pp.323 - 328), Sparck Jones (pp.329 - 338), Salton & Buckley (pp.355 - 364), and Griffiths, Luckhurst, & Willett (pp.365 - 374) from the Sparck Jones & Willett book.
  • Recommended: Chapter 9 in Salton's book.
  • II. The Object Model (Seagle)
  • Readings:
  • To be announced
    March 4, 1998

  • I. Automatic Indexing & Classification (Continued) (Gangolly)
  • II. Object-Oriented Database Systems I. (Seagle)
  • Readings:
  • To be announced
    March 11, 1998

  • I & II. Classification & its Role in Retrieval (Iyer)
  • Readings:
  • Hjorland, Birger. (1994). Nine Principles of Knowledge Organization. In Albrechtsen, Hanne and Oernager, Susanne. (Eds.) Knowledge Organization and Quality Management: Proceedings of the Third International ISKO Conference 20-24 June 1994, Copenhagen, Denmark. Frankfurt, Germany: Indeks Verlag. pp. 91-100.
  • Jacob, Elin K. (1994). Classification and Crossdisciplinary Communication: Breaching the Boundaries Imposed by Classificatory Structures. In Albrechtsen, Hanne and Oernager, Susanne. (Eds.) Knowledge Organization and Quality Management: Proceedings of the Third International ISKO Conference 20-24 June 1994, Copenhagen, Denmark. Frankfurt, Germany: Indeks Verlag. pp. 101-108.
  • Lakoff, George. (1990). Women, Fire and Dangerous Things: What Categories Reveal about the Mind. Chicago: The University of Chicago Press, pp.1-57.
  • Lakoff, George. (1987). Cognitive Models and Prototype Theory. In Neisser, Ulric, (Ed.), Concepts and Conceptual Development, pp. 63-100.
    March 18, 1998

  • Spring Break (No Class)
    March 25, 1998

  • I. Automatic Indexing III and Introduction to Retrieval & The Vector-Space Model in Information Retrieval (Gangolly)
  • Readings:
  • van Rijsbergen (pp.268 - 272), Salton, Wong, and Yang (pp.273 - 280), Salton, Allan, Buckley & Singhal (pp.478 - 483) from the Sparck Jones & Willett book.
  • Recommended: Chapter 10 in Salton's book.
  • II. The Object Model (Continued) (Seagle)
  • Readings:
  • To be announced
    April 1, 1998

  • I & II. The Vector-Space Model in Information Retrieval (Continued) (Gangolly)
    April 8, 1998

  • I & II. Probabilistic Models of Information Retrieval (Gangolly)
  • Readings:
  • van Rijsbergen, Chapter 6.
  • Turtle & Croft (pp.287 - 298), Croft & Harper (pp.339 - 344) papers in the Sparck Jones & Willett book.
    April 15, 1998

  • I & II. Natural Language Based Models of Information Retrieval (Gangolly)
  • Readings:
  • Rau (pp.527 - 533), and Johnson, Paice, Black, & Neal (pp.538 - 552) papers from the Sparck Jones & Willett book.
  • Salton's book, Chapter 11.
    April 22, 1998

    I & II. Geographic Information Systems (Mower/Pipkin)

    April 29, 1998

    Student presentation of papers/projects (Gangolly)

    May 6, 1998

    Student presentation of papers/projects (Gangolly)



    Updated on January 6, 1998 by gangolly@cnsunix.albany.edu