Translating natural language to database language (or NLP to SQL)

People don’t speak database language. Machines speak Structured Query Language (SQL), while humans speak… well, English, for example. Nevertheless, we all want to use data to make better decisions, which means we need to overcome major hurdles such as tedious dashboards or SQL scripts (for those of us who know SQL).

One way to solve this is to convert natural language to SQL. Yep, just like Google Translate translates Japanese to English.

At the core of Nibi’s technology, we understand both the user and the data, and we bridge the communication gap between them. This means you can ask a question in your own words, and we translate it on the fly to SQL (or any other database language) that is relevant to your data.

To get it right, we need to solve two very different challenges. On the database side, we need to understand the schema, the values, the data types and, more importantly, how people might ask questions about the data. On the user side, we need to understand the words, the parts of the sentence and how they all relate to each other.
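To make the database side a bit more concrete, here is a minimal sketch of how a schema and its values could be turned into a lookup of words a user might type. It is an illustration only, not Nibi's implementation: the `build_lexicon` function, the `orders` table and its rows are all made up for the example, and it uses SQLite simply because it ships with Python.

```python
import sqlite3


def build_lexicon(conn):
    """Map lowercase words to the column or stored value they may refer to.

    Hypothetical helper for illustration; a real system would also need
    synonyms, plurals, numeric ranges, and so on.
    """
    lexicon = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()]
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        for _, column, col_type, *_ in columns:
            lexicon[column.lower()] = {
                "kind": "column", "table": table,
                "column": column, "type": col_type,
            }
            # Index the distinct values of text columns so that a word like
            # "japan" can later be recognized as a filter on "country".
            if col_type.upper() == "TEXT":
                rows = conn.execute(
                    f"SELECT DISTINCT {column} FROM {table}").fetchall()
                for (value,) in rows:
                    if value is not None:
                        lexicon[str(value).lower()] = {
                            "kind": "value", "table": table,
                            "column": column, "value": value,
                        }
    return lexicon


# Made-up data, only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "France", 120.0), (2, "Japan", 80.0)],
)
lexicon = build_lexicon(conn)
print(lexicon["japan"])
```

The point of the sketch is that the vocabulary comes from the data itself: column names and stored values both become things a question can mention.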

Then, once we get these two parts right, our core technology understands each part of the sentence the user is typing and maps it to our intermediate language (Nibi Query Language), which is generated dynamically from the data schema itself. In other words, we develop a third language that evolves and can be matched to any type of database. Boom.
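As a rough illustration of the idea of an intermediate language, the sketch below maps the words of a question onto a small intermediate structure and only then renders SQL. Everything in it is an assumption made for the example: `IntermediateQuery`, `parse`, `to_sql`, the tiny `LEXICON` and the `orders` table are hypothetical, and the actual Nibi Query Language is not described in this post.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Words that signal an aggregate function (a tiny, made-up set).
AGGREGATES = {"total": "SUM", "average": "AVG", "count": "COUNT"}

# Hypothetical lexicon derived from a schema: word -> what it refers to.
LEXICON = {
    "amount": ("column", "orders", "amount"),
    "japan": ("value", "orders", "country", "Japan"),
    "france": ("value", "orders", "country", "France"),
}


@dataclass
class IntermediateQuery:
    """Database-neutral description of what the user asked for."""
    table: Optional[str] = None
    aggregate: Optional[str] = None
    column: Optional[str] = None
    filters: List[Tuple[str, str]] = field(default_factory=list)


def parse(question: str) -> IntermediateQuery:
    """Map each recognized word of the question to a slot of the query."""
    query = IntermediateQuery()
    for word in re.findall(r"[a-z]+", question.lower()):
        if word in AGGREGATES:
            query.aggregate = AGGREGATES[word]
        elif word in LEXICON:
            kind, table, column, *value = LEXICON[word]
            query.table = table
            if kind == "column":
                query.column = column
            else:
                query.filters.append((column, value[0]))
    return query


def to_sql(query: IntermediateQuery) -> Tuple[str, list]:
    """Render the intermediate query as parameterized SQL."""
    if query.table is None:
        raise ValueError("question did not mention anything in the schema")
    target = query.column or "*"
    select = f"{query.aggregate}({target})" if query.aggregate else target
    sql = f"SELECT {select} FROM {query.table}"
    if query.filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in query.filters)
    return sql, [value for _, value in query.filters]


sql, params = to_sql(parse("What is the total amount in Japan?"))
print(sql, params)
# SELECT SUM(amount) FROM orders WHERE country = ? ['Japan']
```

Because SQL is produced only in the last step, a different `to_sql`-style renderer could target another database language without touching the parsing step.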

For those of you who are in the NLP business, you are probably familiar with ln2SQL and SQLNet, two open source projects that try to tackle the same challenge. The main issue with both is that they are too narrow. To build a product that people would actually use, the solution needs to be as rich as our language. These solutions are very limited and inflexible. That doesn’t work.

Our approach from day one was to build a robust and rich engine that leverages the best of the technologies out there: Artificial Intelligence combined with rule-based NLP that our experts have created. To get it right, we are building our own parser and extending existing open source solutions (such as spaCy) to create the first building block. Then, moving to the AI phase, we partnered with several design partners and focused our efforts on a relatively small number of datasets, each tapped by a high number of questions, to improve our engine. WikiSQL takes the exact opposite approach, as it is based on fairly simple questions asked about data from Wikipedia. We think that by taking our approach rather than the more common WikiSQL one, we build an NLP to SQL engine that can handle deeper, more complex questions that relate to varied and complex datasets. We have already seen this in action.

If you want to hear more, please reach out to us.
