Extracting parts of a message (using regular expressions?)

Hi All,

I am trying to create a bot to help novice programmers to, amongst other things, decode compiler error messages. The idea is that they could copy and paste the error message and be offered help on what it really means and how to fix it. I’d like to extract some information from the message and use it in the response.


Chatbot: Paste your error here
User: ERROR on line 17: Variable counter has not been assigned a value.
Chatbot: This error means that somewhere on line 17 you are trying to use a
variable called "counter", but the computer has not seen this variable 
before and so doesn't know about it.
Chatbot: Possible Solution: If you think you have created this variable and
put a value in it, then check your spelling of the variable name.
Example of this error
number = 4
number = Number + 1
In this case the computer sees Number as a different variable to number
Chatbot: Possible Solution: Make sure that the variable has a value before
you use it. This might be as simple as setting a score to 0 before a game
Example of this error
score = score + 1
In this case, the variable score has not been set a value. The computer tries
to evaluate score + 1, but gets stuck because it doesn't know what 
score is

Most chatbots, including Botpress, seem to work well with training and natural-language data, but this seems more like a job for regular expressions or similar. I have looked at the patterns in the NLU unit, but they seem to match the whole pattern rather than letting me extract part of it.
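To make the question concrete, here is a minimal sketch of the kind of extraction I mean, using capture groups on the sample message above. The pattern and the names (UNASSIGNED_VARIABLE, parseUnassignedVariable) are made up for illustration and are not part of Botpress; real compiler messages would need their own patterns.

```typescript
// Hypothetical pattern for the sample message in this thread.
// Group 1 captures the line number, group 2 the variable name.
const UNASSIGNED_VARIABLE =
  /^ERROR on line (\d+): Variable (\w+) has not been assigned a value\.?$/;

interface UnassignedVariableError {
  line: number;
  variable: string;
}

// Returns the extracted parts, or null when the message does not match.
function parseUnassignedVariable(message: string): UnassignedVariableError | null {
  const match = UNASSIGNED_VARIABLE.exec(message.trim());
  if (match === null) {
    return null;
  }
  return { line: Number(match[1]), variable: match[2] };
}

const parsed = parseUnassignedVariable(
  "ERROR on line 17: Variable counter has not been assigned a value."
);
console.log(parsed);
```

The extracted line number and variable name could then be substituted into the help text.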

Any ideas about how to approach a project like this? Code it in a custom action?

I have a lot of development experience, but this would be my first Botpress bot.


That’s a cool use case.

Yes, pattern entities in Botpress NLU are extracted as whole patterns. If you really want to use pattern entities, I suggest creating a separate entity for each sub-pattern you want to match.

But since parsing those code snippets won’t really be natural language, I suggest you create a beforeIncoming hook and handle the parsing from there.

Let me know about your progress, I think that’s an interesting idea.

Thanks for the encouragement.

So, to make this generic and simple to update, I don’t really want to hard code the regular expressions. I have checked, and I can add groups into the pattern.

My current idea is to use an afterincoming hook to check whether a matching message has been detected. I then want to get the original regex and use it to extract the groups (however many there are) into variables (or entities?) named regex_param1, regex_param2, etc., and then trigger an intent of the same name as the hook. I would do this by doing a search and replace on the regexp, i.e. rewrite it to $1 and store the result in regex_param1, then rewrite it to $2 and store the result in regex_param2.

This would mean by inserting a regex and intent with the same name, I can update the bot just through the GUI.
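Roughly, the generic extraction could look like the sketch below. It takes the pattern as a plain string (as it would be entered in the GUI) and copies every capture group into regex_param1, regex_param2 and so on. Note that it reads the groups straight from the match result rather than doing the $1/$2 search and replace described above, which should give the same values with less work. The function name extractRegexParams is hypothetical.

```typescript
// Copies each capture group of `pattern` into regex_param1, regex_param2, ...
// Returns null when the message does not match the pattern at all.
function extractRegexParams(
  pattern: string,
  message: string
): Record<string, string> | null {
  const match = new RegExp(pattern).exec(message);
  if (match === null) {
    return null;
  }
  const params: Record<string, string> = {};
  for (let i = 1; i < match.length; i++) {
    // An optional group that did not take part in the match is stored as "".
    params[`regex_param${i}`] = match[i] ?? "";
  }
  return params;
}

const params = extractRegexParams(
  "ERROR on line (\\d+): Variable (\\w+) has not been assigned a value",
  "ERROR on line 17: Variable counter has not been assigned a value."
);
console.log(params);
```

Because the number of groups comes from the pattern itself, adding a new error type would not need any code changes.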

The main issue at the moment is getting hold of the original regex. At the moment I am looking into using getFileAsObject to read in the pattern definitions.

Sound like a sound approach? Am I going 3 sides round the square anywhere?

In the long term, would this be better as a module?



Update on this - now mostly working. In the end I added an action to do the extraction rather than a hook. Using a hook meant that I didn’t have access to the temp variable - so I couldn’t see how to pass the data back to the bot.

At the moment I am reading the matched entity file in as an object and using this to extract the pattern. I would love to know if I can get at the pattern without having to read in the json.
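For reference, this is roughly how I pull the pattern out once the file contents are available. It is only a sketch: I am assuming the entity JSON has a "type" field equal to "pattern" and a "pattern" field holding the regex source, so check those names against your own entity files. The function name patternFromEntityJson is hypothetical, and the file reading itself is left out.

```typescript
// Assumed shape of a pattern entity file; the field names are an assumption.
interface PatternEntity {
  name: string;
  type: string;
  pattern: string;
}

// Returns the regex source stored in the entity, or null if the JSON
// does not look like a pattern entity.
function patternFromEntityJson(json: string): string | null {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const entity = parsed as Partial<PatternEntity>;
  if (entity.type !== "pattern" || typeof entity.pattern !== "string") {
    return null;
  }
  return entity.pattern;
}

// Example with an in-memory entity instead of a real file.
const sampleEntity = JSON.stringify({
  name: "unassigned_variable",
  type: "pattern",
  pattern: "Variable (\\w+) has not been assigned a value",
});
console.log(patternFromEntityJson(sampleEntity));
```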

I can now detect two different error messages and flow to relevant cards. I am looking at using jumpTo rather than having to flow to each one manually.

The regex matching is all done first, so if there is no match then it will use NLU to answer general questions about programming.
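The "regex first, NLU second" ordering can be sketched as a small lookup. This is not the actual bot code: the pattern names, the second pattern and the "nlu_fallback" label are placeholders, and in the bot the returned target would be the node to jump to.

```typescript
interface ErrorPattern {
  name: string;
  pattern: RegExp;
}

// Placeholder patterns; the second one is invented purely for illustration.
const ERROR_PATTERNS: ErrorPattern[] = [
  { name: "unassigned_variable", pattern: /Variable (\w+) has not been assigned a value/ },
  { name: "missing_bracket", pattern: /Expected '\)' on line (\d+)/ },
];

interface Route {
  target: string;
  groups: string[];
}

// Tries every regex in order; if none match, hands the message over to NLU.
function route(message: string): Route {
  for (const { name, pattern } of ERROR_PATTERNS) {
    const match = pattern.exec(message);
    if (match !== null) {
      return { target: name, groups: match.slice(1) };
    }
  }
  return { target: "nlu_fallback", groups: [] };
}

console.log(route("ERROR on line 17: Variable counter has not been assigned a value."));
console.log(route("What is a for loop?"));
```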