Niedersächsische Staats- und Universitätsbibliothek Göttingen

Text and Data Mining

The SUB Göttingen participates in text and data mining (TDM) projects, where it develops tools for natural language processing and provides text resources and TDM tools.

MONAPipe

MONAPipe stands for "Modes of Narration and Attribution Pipeline". It offers natural language processing tools for the German language and is implemented in Python with spaCy. In addition to the components provided by spaCy, MONAPipe offers specific components and models for Digital Humanities and Computational Literary Studies.

MONAPipe was originally created in the MONA project group and is now being further developed within the Text+ infrastructure. 

More Information

Find out more on the Text+ website.


MINE - Text Mining Service for digital resources

The MINE project aims to pool text resources that are available on the Göttingen campus or provided by partners around the world. The service then allows full-text search and search via metadata, which also includes results from text and data mining tools. These results are also made available in a knowledge graph.

The project is developing a service infrastructure for text and data mining (TDM), which will be turned into a campus service at the end of the project. The aim is to give researchers and digital services simple and direct access to TDM tools and text resources. MINE not only enables searching existing data and metadata, but also enriches the metadata using prepared TDM tools. The enriched results are stored in a knowledge graph that provides new ways to explore the available resources.
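To make the idea of enriching metadata and storing the results in a knowledge graph more concrete, here is a minimal sketch in Python. It does not reflect MINE's actual implementation: the `Record` class, the toy `extract_keywords` function standing in for a TDM tool, and the predicate names `hasTitle` and `mentions` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical record shape; MINE's real data model is not shown here."""
    record_id: str
    title: str
    text: str
    keywords: list = field(default_factory=list)


def extract_keywords(text: str, vocabulary: set) -> list:
    """Toy stand-in for a TDM tool: return the vocabulary terms found in the text."""
    tokens = {token.strip(".,;:!?").lower() for token in text.split()}
    return sorted(tokens & vocabulary)


def to_triples(record: Record) -> list:
    """Express a record and its enrichment as (subject, predicate, object) triples."""
    triples = [(record.record_id, "hasTitle", record.title)]
    triples += [(record.record_id, "mentions", kw) for kw in record.keywords]
    return triples


# Enrich one record, then turn it into graph statements.
record = Record("doc:1", "Letters from Göttingen", "The library holds letters and maps.")
record.keywords = extract_keywords(record.text, {"library", "letters", "maps", "globe"})
triples = to_triples(record)

for triple in triples:
    print(triple)
```

Once records are expressed as triples, two resources that mention the same term become connected through that term, which is the kind of exploration a knowledge graph makes possible.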

Currently, the service offers searches in approximately 7 million data sets from various data sources, which are combined in a normalized data model. The technical infrastructure, which is currently under development, is constantly being expanded with new tools and additional text resources.
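Combining several data sources in a normalized data model can be pictured roughly as follows. This is only a sketch under stated assumptions: the two source formats, the field names, and the `NormalizedRecord` class are hypothetical and are not taken from MINE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NormalizedRecord:
    """Hypothetical common shape that every source is mapped onto."""
    source: str
    identifier: str
    title: str
    year: Optional[int]


def from_catalogue(raw: dict) -> NormalizedRecord:
    # Assumed source A: flat keys, year stored as a string.
    year = int(raw["year"]) if raw.get("year") else None
    return NormalizedRecord("catalogue", raw["id"], raw["title"].strip(), year)


def from_repository(raw: dict) -> NormalizedRecord:
    # Assumed source B: nested metadata, date stored as an ISO string.
    meta = raw["metadata"]
    issued = meta.get("issued")
    year = int(issued[:4]) if issued else None
    return NormalizedRecord("repository", raw["handle"], meta["name"].strip(), year)


records = [
    from_catalogue({"id": "cat-42", "title": " Faust ", "year": "1808"}),
    from_repository(
        {"handle": "hdl:1/2", "metadata": {"name": "Faust II", "issued": "1832-01-01"}}
    ),
]

# After normalization, one query works across all sources.
before_1820 = [r for r in records if r.year is not None and r.year < 1820]
```

The point of the design is that search and enrichment code only needs to understand the normalized shape, so adding a new source means writing one more mapping function.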

You can access the prototype at https://mine-graph.de/. Some functions are only available on the Göttingen Campus.

MINE provides various REST endpoints that other systems can use. There is a Python client library and an Orange widget to integrate text resources into your own pipelines or tools. If you have further questions or would like to get full access, please contact the MINE team (mine-team@gwdg.de).
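As a rough illustration of how another system might address such a REST endpoint, the following Python sketch builds a search URL using only the standard library. Only the base address is taken from this page. The path `api/search` and the parameters `q`, `page` and `size` are assumptions for illustration, not the documented MINE API; please ask the MINE team for the actual endpoints and the client library.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://mine-graph.de/"  # prototype address given on this page
SEARCH_PATH = "api/search"           # hypothetical path, not the documented API


def build_search_url(query: str, page: int = 1, page_size: int = 20) -> str:
    """Build a full-text search URL for the hypothetical search endpoint."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Parameter names are assumptions; urlencode handles escaping of umlauts and spaces.
    params = urlencode({"q": query, "page": page, "size": page_size})
    return f"{urljoin(BASE_URL, SEARCH_PATH)}?{params}"


url = build_search_url("Göttingen Sieben")
print(url)
```

The resulting URL could then be passed to any HTTP client. The sketch deliberately stops before sending a request, since the real endpoints and access rules may differ.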

MINE is a collaboration between the Göttingen State and University Library (SUB Göttingen) and the Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen (GWDG).
