Sebastian Hellmann - DCLRS Seminar - 14th June 2012

Video Category: Research Seminar Talk

Title: A Transparent Formalization of Text for Machines

Abstract: The talk has three parts. First, I will introduce my research group and give an overview. Then I will introduce the LOD2 Stack. Finally, I will talk about the NLP Interchange Format (NIF), an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. The motivation behind NIF is to allow NLP tools to exchange annotations about text documents in RDF. Hence, the main prerequisite is that parts of the documents (i.e. strings) are referenceable by URIs, so that they can be used as subjects in RDF statements. The String Ontology, which is the basis for NIF, fixes the referent of an annotation (i.e. a string in a given text) unambiguously for machines and thus enables the creation of heterogeneous, distributed and loosely coupled NLP applications that use the Web as an integration platform. We evaluate the String Ontology based on the adequacy of the collected requirements and by benchmarking the stability of the NIF URI scheme in a Web annotation scenario. I will also talk briefly about Web annotation and show some recent demos (http://annotateit.org/, http://sourceforge.net/projects/fragmentlinks/, http://www.w3.org/community/openannotation/wiki/TextCommentOnWebPage).
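
To illustrate the idea of making a string referenceable by a URI so that it can serve as the subject of an RDF statement, here is a minimal Python sketch. It is not the NIF URI scheme itself: the `#offset_BEGIN_END` fragment pattern, the function names, the example.org document URI and the `refersTo` predicate are all assumptions made up for this illustration.

```python
def string_uri(doc_uri: str, text: str, begin: int, end: int) -> str:
    """Build a URI for the substring text[begin:end] of a document.

    The '#offset_BEGIN_END' fragment is a hypothetical pattern chosen
    for this sketch, not the scheme defined by NIF.
    """
    if not 0 <= begin < end <= len(text):
        raise ValueError("offsets must satisfy 0 <= begin < end <= len(text)")
    return f"{doc_uri}#offset_{begin}_{end}"


def annotation_triple(doc_uri: str, text: str, begin: int, end: int,
                      predicate: str, obj: str) -> str:
    """Return one N-Triples line whose subject is the string URI."""
    subject = string_uri(doc_uri, text, begin, end)
    return f"<{subject}> <{predicate}> <{obj}> ."


# Illustrative data only: document URI and predicate are hypothetical.
text = "The Semantic Web is an extension of the Web."
begin = text.index("Semantic Web")
end = begin + len("Semantic Web")

triple = annotation_triple(
    "http://example.org/doc.txt", text, begin, end,
    "http://example.org/vocab#refersTo",
    "http://dbpedia.org/resource/Semantic_Web",
)
print(triple)
```

Because the subject URI encodes the character offsets, two independent tools that annotate the same span of the same document would produce the same subject, which is what allows their annotations to be merged. A purely offset-based pattern like this one breaks as soon as the document text changes, which is presumably why the stability of the URI scheme is part of the evaluation mentioned in the abstract.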

Biography: Sebastian Hellmann (http://bis.informatik.uni-leipzig.de/SebastianHellmann) obtained his master's degree in 2008 from the University of Leipzig, where he is currently a PhD student in the Agile Knowledge Engineering and Semantic Web (http://aksw.org) research group. He is founder, leader or contributor of several open source projects, including DL-Learner, DBpedia and NLP2RDF. His research interests include lightweight ontology engineering methods, data integration and scalability in the Web of Data. Sebastian is the author of over 10 peer-reviewed scientific publications and (co-)chaired the Open Knowledge Conference 2011, the Linked Data Cup 2012, the Linked Data in Linguistics Workshop 2012 (http://ldl2012.lod2.eu/) and the Web of Linked Entities Workshop 2012 (http://wole2012.eurecom.fr/).