Noriko Tomuro, PhD.
Contact Information
College of Computing
and Digital Media (CDM),
DePaul University
243 S. Wabash Ave.
Chicago, IL 60604 U.S.A.
Tel: +1 312 362 5218
Email: tomuro@cs.depaul.edu
Web: http://condor.depaul.edu/~ntomuro/
Research Interests
Natural Language Processing (NLP), Lexical Semantics, Information
Extraction, Machine Learning, Text Mining and Analytics, Data Mining.
Research Summary:
My research interests are Natural Language Processing (NLP) and Machine Learning
(ML), with a particular focus on the development of NLP algorithms/techniques
which exploit lexical characteristics of domain text. In the past I have
developed a Unification-based parsing algorithm in which semantic information is
integrated with syntax, and a clustering algorithm which resolves semantic
ambiguity of polysemous words. I have also developed techniques to derive
and visualize concept hierarchies from adjectives. In
recent years I have been working with domain experts to apply NLP techniques in
other fields. I worked with researchers in Game Studies and analyzed game
reviews to help verify their research hypotheses. I have also
worked with medical research analysts to process and analyze radiology reports.
My latest work is on Privacy -- I worked with experts on Privacy to build an
automatic privacy policy summarization system (a practical application).
I have also supervised several doctoral theses,
some of which are in the area of Image Processing (e.g. automatic classification
of mammograms).
Education
Professional Employment History
- Associate Professor, DePaul University,
Fall 2005 to present.
- Assistant Professor, DePaul University,
Fall 2000 to Spring 2005.
- Visiting Assistant Professor, DePaul University,
Fall 1999 to Spring 2000.
- Lecturer, DePaul University,
Fall 1996 to Spring 1999.
- Taught programming and data structure courses (in C++).
- Research Internship, NEC Research, NJ.
Summer 1994.
- Generated English and Japanese lexicons for a
Principles and Parameters parser from the
broad-coverage machine-readable dictionary EDR.
- Advisor: Sandiway Fong
- Programmer, Furkon Inc., Chicago, IL. May 1993 to January
1994.
- Developed client-server applications using
Motif/X-Windows and embedded SQL in a Unix
network environment.
Honors and Awards
- Graduate Assistance in Areas of National Need (GAANN). The U.S.
Department of Education.
- Capacity: Co-Director.
- Awarded 2012 ($659,625), 2007
($633,360), 2000 ($432,855).
- DePaul University Quality of Instruction Council (QIC) grant.
- Capacity: Principal Investigator.
- Awarded 2012-13 (paid leave),
2011 ($3,350), 2008 ($3,500).
- DePaul University Research Council (URC) grant.
- Capacity: Principal Investigator.
- Awarded 2010 ($3,498), 2007 ($2,550).
- Graduate Assistance in Areas of National Need (GAANN).
- Capacity: Recipient
(student). September 1995 to August 1998.
Publications
Refereed Journal Articles
- Zagal, J., Tomuro, N. and Shepitsen, A. (2011).
"Natural Language Processing for Games Studies Research". Journal
of Simulation & Gaming (S&G), Special Issue on Games Research Methods.
http://gamesmethods.wordpress.com
- Tomuro, N. and Lytinen, S. (2004).
"Retrieval Models and Q&A Learning with FAQ Files"
(a book chapter). New Directions in Question Answering, p.
183-194. AAAI Press / The MIT Press.
- Tomuro, N. (2004).
"Question Terminology and Representation for Question Type Classification".
Journal of Terminology, 10 (1), pp. 153-168.
- Tomuro, N. and Lytinen, S. (2001).
"Nonminimal Derivations in Unification-based Parsing".
Computational Linguistics, Vol. 27 (2001), Number 2.
- Burke, R., Hammond, K., Kulyukin, V., Lytinen, S., Tomuro, N. and Schoenberg, S. (1997).
"Question Answering from Frequently Asked Question Files: Experiences with
the FAQFinder System". AI Magazine, Summer, 18 (2), pp. 57-66.
Refereed Conference & Workshop Papers
- Tomuro, N., Lytinen, S. and Hornsburg, K. (2016).
"Automatic Summarization of Privacy
Policies using Ensemble Learning". In Proceedings
of the ACM Conference on Data and Application Security and Privacy (CODASPY
2016), New Orleans, LA. Best Poster Award.
- Pourashraf, P. and Tomuro, N. (2015).
"Use
of a Large Image Repository to Enhance Domain Dataset for Flyer
Classification". In Proceedings of the 11th International
Symposium on Visual Computing (ISVC'15),
Las Vegas, NV.
- Zhang, Y., Tomuro, N., Furst, J. and Raicu, D. (2015).
"Combining
Edge-based and Region-based Segmentations for Diagnosing Masses in
Mammograms". In Proceedings of the 29th International Congress of
Computer Assisted Radiology and Surgery (CARS
2015), Barcelona, Spain.
- Pourashraf, P., Tomuro, N. and Apostolova, E. (2015).
"Genre-based
Image Classification Using Ensemble Learning for Online Flyers". In
Proceedings of the 7th International Conference on Digital Image Processing
(ICDIP 2015), Los Angeles, CA.
- Zhang, Y., Tomuro, N., Furst, J. and Raicu, D. (2015).
"Identifying
the Optimal Segmentors for Mass Classification in Mammograms". In
Proceedings of the SPIE Symposium on Medical Imaging (SPIE
Medical Imaging 2015), Orlando, FL.
- Apostolova, E. and Tomuro, N. (2014).
"Combining
Visual and Textual Features for Information Extraction from Online Flyers".
In Proceedings of the 2014 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2014), Doha,
Qatar.
- Tomuro, N., Tanaka, S. and Zagal, J. (2014).
"Developing
Soft Skills the Hard Way: International Student Game Projects".
RePlaying Japan 2014 (https://sites.google.com/a/ualberta.ca/replayingjapan2014/).
- Zagal, J. and Tomuro, N. (2013).
"Cultural
differences in game appreciation: A study of player game reviews". In
Proceedings of the 8th International Conference on the Foundations of Digital
Games (FDG-13). Best Paper Award.
- Kleper, V., Tomuro, N., Apostolova, E., Lytinen, S., Wang, L. and Mongkolwat,
P. (2013).
"Linking Radiology Reports to the Corresponding Pathology Reports Based on
Biopsy Recommendations in Radiology Reports, Using Unified Medical Language
System Concept Unique Identifiers". In Proceedings of the Annual Meeting of
The Society for Imaging Informatics in Medicine (SIIM 2013),
Grapevine-Dallas, TX. Best Poster Award.
- Raison, K., Tomuro, N., Lytinen, S. and Zagal, J. (2012).
"Extraction of User Opinions by Adjective-Context Co-clustering for Game
Review Texts". In Proceedings of the 8th International Conference on Natural
Language Processing (JapTAL 2012),
Kanazawa, Japan.
- Kleper, V., Tomuro, N., Apostolova, E., Lytinen, S., Wang, L. and Mongkolwat,
P. (2012).
"Automated Extraction of Mammography Pathological Recommendations from
Mammography Radiological Reports and Linking Each Radiological Report to the
Associated Pathological Reports". In Proceedings of the 98th Scientific
Assembly and Annual Meeting of the Radiological Society of North America (RSNA
2012), Chicago, USA.
- Apostolova, E., Tomuro, N., Mongkolwat, P. and Demner-Fushman, D. (2012).
"Domain Adaptation of Coreference Resolution for Radiology Reports".
In Proceedings of the 11th workshop on Biomedical Natural Language Processing (BioNLP
2012), Montreal, Canada.
- Apostolova, E., Tomuro, N. and Demner-Fushman, D. (2011).
"Automatic Extraction of
Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes". In Proceedings of the 49th Annual Meeting of the
Association for Computational Linguistics: Human Language Technologies
(ACL/HLT-11), Portland, OR.
- Zhang, Y., Tomuro, N., Furst, J. and Raicu, D. (2011).
"Building an Ensemble System for Diagnosing Masses in Mammograms". In
Proceedings of the 25th International Congress of Computer Assisted Radiology
and Surgery (CARS 2011), Berlin, Germany.
- Zhang, Y., Tomuro, N., Furst, J. and Raicu, D. (2011).
"Building Multiple weak segmentors for strong mass segmentation in mammogram". In
Proceedings of the SPIE Symposium on Medical Imaging (SPIE-2011).
- Apostolova, E. and Tomuro, N. (2010).
"Exploring Surface-Level Heuristics for Negation and Speculation Discovery in
Clinical Texts". In Proceedings of the 9th workshop on Biomedical
Natural Language Processing (BioNLP 2010), Uppsala, Sweden.
- Zagal, J., Tomuro, N. and Shepitsen, A. (2010).
"Natural Language Processing for Games Studies Research".
Games Research Methods seminar.
- Apostolova, E., Neilan, S., An, G., Tomuro, N. and Lytinen, S. (2010).
"Dangology: A Light-weight Web-based Tool for Distributed Collaborative Text
Annotation". In Proceedings of the 7th International Conference on
Language Resources and Evaluation (LREC 2010).
- Zhang, Y., Tomuro, N., Furst, J. and Raicu, D. (2010).
"Image Enhancement and Edge-based Mass Segmentation in Mammogram".
In Proceedings of the SPIE Symposium on Medical Imaging (SPIE-2010).
- Tomuro, N. and Shepitsen, A. (2009).
"Construction of Disambiguated Folksonomy Ontologies
Using Wikipedia". In Proceedings of the workshop
on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources at
the Association for Computational Linguistics (ACL-09).
- Shepitsen, A. and Tomuro, N. (2009).
"Improving Diversity and Relevancy of E-commerce Recommender Systems Through NLP
Techniques". In Proceedings of the IADIS e-Commerce (EC
2009) Conference.
- Shepitsen, A. and Tomuro, N. (2009).
"Personalized Search in Folksonomies with Ontological
User Profiles". In Proceedings of the International Joint
Conference Intelligent Information Systems (IIS
2009).
- Shepitsen, A. and Tomuro, N. (2009).
"Search in Social Tagging Systems Using Ontological User Profiles".
In Proceedings of the 3rd Int'l AAAI Conference on Weblogs and Social Media (ICWSM
2009).
- Tomuro, N. and Lytinen, S. (2008).
"Polysemy in Lexical Semantics -- Automatic Discovery of Polysemous Senses and
Their Regularities". NYU
Symposium on Semantic Knowledge Discovery, Organization and Use.
- Kanzaki, K., Tomuro, N. and Isahara, H. (2008).
"The "Close-Distant" Relation of Adjectival Concepts Based on Self-Organizing
Map". In Proceedings of the workshop on Cognitive Aspects of the
Lexicon at the 22nd International Conference on Computational Linguistics (Coling
2008).
- Kanzaki, K., Bond, F., Tomuro, N. and Isahara, H. (2008).
"Extraction of Attribute Concepts from Japanese Adjectives". In
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08).
- Tomuro, N., Lytinen, S., Kanzaki, K. and Isahara, H. (2007).
"Clustering Using Feature Domain Similarity to Discover Word Senses for
Adjectives". In Proceedings of the 1st IEEE International
Conference on Semantic Computing (ICSC-2007).
- Tomuro, N., Kanzaki, K. and Isahara, H. (2007).
"Discovering Word Senses for Polysemous Words Using Feature Domain Similarity".
In Proceedings of the Conference of the Pacific Association for Computational
Linguistics (PACLING-2007).
- Kanzaki, K., Tomuro, N. and Isahara, H. (2007).
"Extraction and Organization of Abstract Concepts that Categorize Adjectives
From Corpora". In Proceedings of the 4th International Workshop on
Generative Approaches to the Lexicon (GL-2007).
- Tomuro, N., Kanzaki, K. and Isahara, H. (2007).
"Self-organizing Conceptual Map and Taxonomy of Adjectives". In
Proceedings of the 18th Midwest Artificial Intelligence and Cognitive Science
Conference (MAICS 2007).
- Tomuro, N. (2003).
"Interrogative reformulation patterns and acquisition of question paraphrases".
In Proceedings of the International Workshop on Paraphrasing (IWP'03) at
ACL-2003, Sapporo, Japan.
- Tomuro, N. (2002).
"Question Terminology and Representation for Question Type Classification".
In Proceedings of the 2nd International Workshop on Computational Terminology
(COMPUTERM02), held at COLING-02, Taipei, Taiwan.
- Lytinen, S. and Tomuro, N. (2002).
"The Use of Question Types to Match Questions in FAQFinder". In
Papers from the 2002 AAAI Spring Symposium on Mining Answers from Texts and
Knowledge Bases, pp. 46-53.
- Tomuro, N. and Lytinen, S. (2001).
"Selecting Features for Paraphrasing Question Sentences". In
Proceedings of the Workshop on
Automatic Paraphrasing at Natural Language Processing Pacific Rim
Symposium (NLPRS 2001), Tokyo, Japan.
- Tomuro, N. and Lytinen, S. (2001).
"Abstract Left-corner Parsing for Unification Grammars". In
Proceedings of the Natural Language Processing Pacific Rim Symposium (NLPRS
2001), Tokyo, Japan.
- Tomuro, N. (2001).
"Tree-cut and A Lexicon based on Systematic Polysemy".
In Proceedings of the North American Chapter of the
Association for Computational Linguistics (NAACL2001).
- Tomuro, N. (2001).
"Systematic Polysemy and Inter-annotator Disagreement: Empirical Examinations".
In Proceedings of the First International Workshop on Generative
Approaches to the Lexicon.
- Lytinen, S., Tomuro, N. and Repede, T. (2000).
"The Use of WordNet Sense
Tagging in FAQFinder". In Proceedings of the workshop on
Artificial Intelligence for Web Search at the 17th National Conference
on Artificial Intelligence (AAAI-2000), Austin, TX.
- Tomuro, N. (2000).
"Automatic Extraction of Systematic Polysemy Using Tree-cut". In
Proceedings of the workshop on
Syntactic and Semantic Complexity in Natural Language Processing Systems
at Language Technology Joint Conference, Applied Natural Language Processing and
the North American Chapter of the Association for Computational Linguistics
(ANLP-NAACL2000), Seattle, WA, pp. 20-27.
- Tomuro, N., Alkoby, K., Berthiaume, A., Chomwong, P., Davidson, M.,
Furst, J., Konie, B., Lancaster, G., Lytinen, S., McDonald, J., Roychoudhuri,
L., Toro, J. and Wolfe, R. (2000).
"An Alternative Method for Building A Database for American Sign Language".
In Proceedings of the conference on Technologies for Persons with
Disabilities (CSUN2000), Los Angeles, CA.
- Tomuro, N. (1998).
"Semi-automatic Induction of Systematic Polysemy from WordNet". In
Proceedings of the workshop on
Usage of WordNet in Natural Language Processing Systems at the 17th
International Conference on Computational Linguistics (COLING-98) and the 36th
Annual Meeting of the Association for Computational Linguistics (ACL-98),
Montreal, Canada, pp. 108-114.
- Tomuro, N. (1998).
"Semi-automatic Induction of Underspecified Semantic Classes". In
Proceedings of the workshop on
Lexical Semantics in Context: Corpus, Inference and Discourse at the
10th European Summer School in Logic, Language and Information (ESSLLI-98),
Saarbruecken, Germany.
- Burke, R., Hammond, K., Kulyukin, V., Lytinen, S., Tomuro, N. and
Schoenberg, S. (1997).
"Natural Language Processing in the FAQFinder System: Results and Prospects".
In Papers from the 1997 AAAI Spring Symposium on Natural Language Processing
for the World Wide Web, pp. 17-26.
- Tomuro, N. (1996).
"Maximizing Top-down Constraints for Unification-based Systems".
In Proceedings of the 34th Annual Meeting of the Association for Computational
Linguistics (ACL-96), Santa Cruz, CA. pp. 381-383.
- Lytinen, S. and Tomuro, N. (1996).
"Left-corner Unification-based Natural Language Processing".
In Proceedings of the 13th National Conference on Artificial Intelligence
(AAAI-96), Portland, OR, pp. 1037-1043.
- Lytinen, S. and Tomuro, N. (1995).
"Steps Toward Real-time Natural Language Processing". In
Proceedings of the 17th Annual Conference of the Cognitive Science Society,
Pittsburgh, PA, pp. 666-670.
Tech Reports
Conference Organization
Workshop Organization
- 3rd Workshop on
"Games and NLP" (GAMNLP-14), October 2014, Raleigh, NC.
- Capacity: Main chair and organizer.
- Planned and executed all
aspects of the workshop (with two co-chairs).
- 2nd Workshop on "Games
and NLP" (GAMNLP-13), November 2013, Istanbul, Turkey.
- Capacity: Main chair and organizer.
- Planned and executed all aspects of the
workshop (with co-chair).
- 1st Workshop on "Games and NLP" (GAMNLP-12), September 2012,
Kanazawa, Japan.
- Capacity: Main chair and organizer.
- Created the workshop. Managed all aspects of the workshop (with co-chair),
including CFP, formation of program committee, review assignment and local
arrangements.
Conference Program Committee
Teaching Experience
Courses Taught
- CSC 215 Intro to Programming in C++ (undergraduate programming intro)
- CSC 310 Data Structures in C++ I (undergraduate data structures intro)
- CSC 312 Data Structures in C++ II (undergraduate data structures
intermediate)
- CSC 211 Intro to Programming in Java I (undergraduate programming intro
I)
- CSC 212 Intro to Programming in Java II (undergraduate programming intro
II)
- CSC 383 Data Structures in Java (undergraduate data structures intro)
- CSC 393 Data Structures in C++ (undergraduate data structures intro)
- CSC 224 Java for Programmers (accelerated Java intro for graduate
prerequisite)
- CSC 309 C++ for Programmers (accelerated C++ intro for undergraduate)
- CSC 404 Accelerated C++ (accelerated C++ intro for graduate
prerequisite)
- CSC 415 Foundation of Computer Science I (discrete mathematics)
- CSC 416 Foundation of Computer Science II (data structures intro for
graduate prerequisite; C++ and Java)
- CSC 417 Foundation of Computer Science III (data structures intermediate
for graduate prerequisite; C++)
- CSC 578 Neural Networks and Machine Learning (graduate AI)
- Re-developed the course. Lectures on major concepts in NN and ML, and student individual projects.
- CSC 594 Topics in Artificial Intelligence (graduate AI)
- Topic: "Applied Natural Language Processing"
- Created the course. Term-long group projects on various NLP tasks
and applications, including game reviews analysis.
- Topic: "Text Mining and Analytics"
- Created the course. Lectures on basic concepts of Text Mining; Use
of SAS Enterprise Miner and Python/NLTK.
- HCI 201 Multimedia and the World Wide Web (undergraduate web page
development for non-majors)
- IT 398 Topics in Global Information Technology
- Topic: "Computer Gaming and Animation in Japan" (in
conjunction with study abroad)
PhD Thesis Supervision & External Committee Member
- Yu Zhang, PhD Computer Science (graduated 2011) -- Capacity: Thesis
advisor.
- Title: "Classification on Masses as Benign or Malignant in
Mammogram Images"
- Emilia Apostolova, PhD Computer Science (graduated 2011) -- Capacity:
Thesis advisor.
- Title: "Information Extraction from Radiology Reports for
the purpose of Automatic Semantic Annotation of Medical Images"
- Payam Pourashraf, PhD Computer Science (on-going) -- Capacity: Thesis
advisor.
- Title: "Multi-modal Classification" (tentative)
- Andriy Shepitsen, PhD Computer Science (stopped) -- Capacity: Thesis
advisor.
- Title: "Knowledge-based Information Retrieval and
Recommender Systems Using Domain Knowledge and Natural Language
Processing"
- Bill Horsthemke, PhD Computer Science (graduated 2010) -- Capacity:
Committee member.
- Xuchang Zou, PhD Computer Science (graduated 2009) -- Capacity:
Committee member.
- Bassam Hammo, PhD Computer Science (graduated 2002) -- Capacity:
Committee member.
- Dmitriy Zinovev, PhD Computer Science (stopped) -- Capacity: Committee
member.
Study Abroad Programs
- "CDM Japan: Computer Gaming and Animation in Japan"
- Capacity: Creator, Lead faculty.
- A short-term (2-week) study abroad trip which visits gaming and
animation companies in Japan.
- Led the trip (a group of 20+ students) four times so far (2005,
2008,
2011,
2013).
- Planned and organized virtually all detailed aspects of the trip,
including arranging visits with companies and various cultural
activities, and working with a travel agency.
- Also taught pre- and post-trip courses (two terms, in-class
sessions).
Student Group & Independent Project/Study Supervision (extra-curricular)
Supervised independent projects/studies for a total of 25 students (MS/BS in Computer Science) since 2005.
- Capacity: Advisor.
- Most projects lasted one term, and the topics were Natural
Language Processing or Machine Learning.
Student Organization Supervision (extra-curricular)
College & University Committees
- PhD committee (CDM). 1999 to present.
- In addition to the regular duties of a committee member, I took the
initiative in several activities, including:
- Created an internal research symposium for PhD students ("CDM
Research Symposium") in 1999, and organized it annually (until
2012).
- Created an open-house event ("CTI PhD Conference") in
1998, and organized it annually (until 2005).
- Designed and initiated a student progress tracking system on the
school's intranet in 2004, and maintained it (until 2012).
- Worked as the chair of the subcommittees for student travel and
PhD office space.
- C++ Review Committee (CDM). Nov 2010 to March 2012.
- Initiated the committee to review and re-organize the existing C++
courses.
- Gathered enrollment statistics, interviewed faculty teaching these
courses, drove discussions and chaired meetings.
- Academic Program Review Committee (DePaul). 2005 to 2011.
- Liberal Studies Scientific Inquiry Domain committee (DePaul). 2013
to present.