Natural Language Processing and Understanding

Components integrated through the UIMA framework


General Architecture for Text Engineering (GATE) is a suite of tools for natural language processing tasks, including information extraction in many languages.


TensorFlow is an open-source neural network framework developed by Google. It includes SyntaxNet, which provides a foundation for Natural Language Understanding systems.

Facebook fastText

fastText is a neural-network-based machine learning library, developed by Facebook Research, for scalable text representation and classification.

Alchemy / Watson

AlchemyLanguage is a collection of natural language processing APIs that help you detect entities and high-level concepts in text.



DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text, providing a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia.

Video, Image & Voice


Visual recognition technology. Automatically tags visual content.


Speech recognition solutions.


Transcribe, index and analyse audio & video content.

Data Management


A triplestore, or RDF store, is a purpose-built database for storing and retrieving triples through semantic queries. A triple is a data entity composed of a subject, a predicate, and an object, such as “Bob is 35” or “Bob knows Fred”.
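
To make the subject-predicate-object idea concrete, here is a minimal in-memory sketch in Python. It is not how any particular triplestore is implemented: the class name TinyTripleStore, the predicate names "age" and "knows", and the extra "Fred knows Alice" fact are assumptions chosen for illustration, and real stores add indexing, persistence, and a query language such as SPARQL.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TinyTripleStore:
    """Hypothetical toy store: a set of triples plus wildcard matching."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Adding the same triple twice has no effect, because a set is used.
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


store = TinyTripleStore()
store.add("Bob", "age", "35")        # "Bob is 35"
store.add("Bob", "knows", "Fred")    # "Bob knows Fred"
store.add("Fred", "knows", "Alice")  # extra fact, assumed for the example

bob_facts = store.match(subject="Bob")      # everything stated about Bob
who_knows = store.match(predicate="knows")  # every "knows" relationship
```

Leaving any position as None is roughly what a variable does in a semantic query: the pattern (Bob, knows, ?) asks "whom does Bob know?".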


Neo4j is a highly scalable native graph database that leverages data relationships as first-class entities, helping enterprises build intelligent applications to meet today’s evolving data challenges.

Apache TinkerPop

Apache TinkerPop™ is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP).



Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™.

Elasticsearch

Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents.
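
As a rough illustration of the RESTful, JSON-based interface, the Python sketch below only builds the URL and JSON body for a full-text "match" query against the _search endpoint; it does not send anything. The host http://localhost:9200, the index name "articles", the field name "content", and the helper build_search_request are all assumptions made for the example.

```python
import json
from typing import Tuple


def build_search_request(
    base_url: str, index: str, field: str, text: str, size: int = 10
) -> Tuple[str, str]:
    """Return (url, json_body) for a full-text match query on one field."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    body = {
        "size": size,                       # maximum number of hits to return
        "query": {"match": {field: text}},  # analyzed full-text match
    }
    return url, json.dumps(body)


# Hypothetical cluster address, index, and field.
url, payload = build_search_request(
    "http://localhost:9200", "articles", "content", "graph database"
)
```

The resulting payload would be sent to the URL with a Content-Type of application/json by whatever HTTP client the application already uses.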

Analytics & Visualization


Zoomdata is a next generation data visualization and analytics system optimized for real-time and historical big data backends.


Kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster.

W3C Standards

Web Annotation Data Model

The Web Annotation Data Model specification describes a structured model and format to enable annotations to be shared and reused across different hardware and software platforms.
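
For orientation, a basic annotation in this model is a JSON-LD document that links a body (the comment or resource doing the annotating) to a target (the thing being annotated). The Python sketch below assembles one such document; the example.org and example.com IRIs and the helper name make_annotation are placeholders assumed for illustration.

```python
import json
from typing import Any, Dict


def make_annotation(annotation_id: str, body: str, target: str) -> Dict[str, Any]:
    """Build a minimal Web Annotation as a JSON-LD-compatible dict."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "body": body,      # IRI of the resource that says something
        "target": target,  # IRI of the resource it is said about
    }


# Placeholder IRIs, assumed for the example.
annotation = make_annotation(
    "http://example.org/anno1",
    "http://example.org/post1",
    "http://example.com/page1",
)
serialized = json.dumps(annotation, indent=2)
```

Because the result is plain JSON, it can be stored or exchanged between platforms without any annotation-specific tooling.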

Web Annotation Vocabulary

The Web Annotation Vocabulary specifies the set of RDF classes, predicates and named entities that are used by the Web Annotation Data Model.

