On August 7-8, 2014, a text research workshop entitled “Exploiting Text” will be held at the University of Waterloo to celebrate the career and achievements of Professor Frank Wm. Tompa, who officially retired from the University of Waterloo on 1 December 2013.
Workshop Overview
Knowledge is most often captured and communicated through natural language text and preserved as digital documents. Documents are amassed into curated digital libraries or loosely bound into searchable repositories, such as the World Wide Web. With today’s widespread adoption of social media, documents covering all possible topics are created, stored, and shared in increasing numbers by amateurs and hobbyists, as well as by experts and scholars.
For many years, we have been developing tools for searching through document collections, extracting data, and summarizing selected sub-collections. The goal of this workshop is to explore directions in which we can make profitable advances in these areas and in which we can further exploit the text resources available in these collections.
We hope to explore applications that exploit large reference texts (such as the Oxford English Dictionary, the National Library’s Early Canadiana Online, or Wikipedia), extremely large corpora (including the Web or corporate intranets and extranets), open government data (such as Canada’s Open Data initiatives), and humanists’ research needs (such as the Margot project). Topics of interest include:
- Organization, storage, and management of large reference texts and curated document repositories
- Search techniques and search engine technology
- Resource discovery and dissemination
- Browsing, text mining, information extraction, summarization, and visualization
Organizers
- Raymond Ng, University of British Columbia, Vancouver BC
- Glenn Paulley, Conestoga College, Kitchener ON
- Ken Salem, University of Waterloo, Waterloo ON
- Charlie Clarke, University of Waterloo, Waterloo ON
- David DeHaan, SAP Labs, Waterloo ON