About the project

Polylingual Hybrid Question Answering

A recent survey estimates that more than 3.7 billion people use the internet every day and together produce nearly 2.5 quintillion bytes of data on the Web daily. Devising ways to gain insights from large amounts of multilingual data is key to the successful use of AI-powered solutions in companies. This challenge is particularly significant in the European context, where data (both textual and structured) is commonly available in a multitude of languages.
PORQUE ultimately aims to facilitate the development of polylingual conversational AIs that enable end users to query large amounts of multilingual textual data. The major innovation of our approach lies in combining automatic machine translation with knowledge graphs (KGs) created on demand, which serve as the lingua franca of our system. The final output of PORQUE will be a novel polylingual hybrid QA framework capable of dealing with distinct domains while remaining multilingual and scalable. Data-driven applications based on PORQUE will thus be easier to interact with and lead to a higher return on investment. With the current uptake of KGs, the number of potential customers for services developed on top of PORQUE promises to grow rapidly over the next few years. The project is motivated by this significant language gap in the market, which we plan to address with the PORQUE technologies.

Why we do this

Data underpin business-critical decisions in modern companies. For example, facts extracted from news influence financial markets by inducing 280% higher trade volume and 180% higher price change within 10 minutes of availability (Fedyk, 2018; https://bit.ly/2OYbfXQ). PORQUE will enable companies to query all available data, including facts obtained from data suppliers, in natural language (question answering) before taking business-critical decisions based on it.

What we do

PORQUE will develop and evaluate an extensible platform for polylingual question answering that relies on knowledge graphs and text. We will combine neural machine translation with language-specific natural language processing techniques to make answering queries across languages efficient and accurate. We will deliver a prototype, which will be evaluated through commercial solutions in three domains: open data, publishing, and energy.