Regular expression not behaving as expected

I want to test whether a string contains a given word. I have this regular expression: /\bde\b/gi

If my string is "Comida de cão", it works.

But if I have a string like "Necessidade de adeus depois", it also matches the "de" in "necessidade", "adeus" and "depois".

Besides, when I try to match a word with an accent in a string like "é a vida", using the regex /\bé\b/gi, nothing is found. But if I search for a word with an accent in the middle, it is found. So in the string "O nível", the regex /\bnível\b/gi matches the right word.

I've searched for similar issues but still haven't managed to solve my problem.

Btw, here the first issue doesn't happen and it works as expected.

Thanks!

Edit: Added my code

var myRe = new RegExp("\\b" + query + "\\b","iu");
var match = myRe.test("Necessidade de adeus depois");
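
For reference, test() only reports whether the pattern matches somewhere in the string; it does not say where. A small sketch for listing the match positions (the variable names are illustrative, and query is assumed to be "de"):

```javascript
const query = "de"; // assumed value of the search term
const text = "Necessidade de adeus depois";

// matchAll needs the g flag; each match object carries its start index.
const re = new RegExp("\\b" + query + "\\b", "giu");
const positions = [...text.matchAll(re)].map((m) => m.index);

console.log(positions); // [12]: only the standalone "de" matches
```

With this input the only match starts at index 12, which is the standalone "de", so a true result from test() is expected for this string.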

Answers:

Answer

The closest thing to a working solution that I have found is this. As stated in my comment, there seems to be a problem with word boundaries and Unicode characters.
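
The behaviour follows from how JavaScript defines \b: it is the boundary between a \w character and a non-\w character, and \w only covers [A-Za-z0-9_], even with the u flag. An accented letter such as é therefore counts as a non-word character. A short sketch of the consequences (the variable names are illustrative):

```javascript
// \w is ASCII-only in JavaScript, so "é" is treated as a non-word character.
const isWordChar = /\w/u.test("é"); // false

// In "é a vida" there is no \w character next to "é", so there is no \b around it.
const loneAccent = /\bé\b/iu.test("é a vida"); // false

// "nível" starts with "n" and ends with "l" (both ASCII), so \b works at both ends.
const innerAccent = /\bnível\b/iu.test("O nível"); // true

// Side effect: \b also appears inside a word, between an ASCII letter and "é".
const falsePositive = /\bcaf\b/iu.test("café"); // true, although "caf" is not a whole word

console.log(isWordChar, loneAccent, innerAccent, falsePositive);
```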

This solution can probably be improved, but it uses a positive lookahead (which doesn't consume any characters) to test for either the start ^ or the end $ of the string, or for a non-word character:

// accented character at the start or end of the word
/(?=^|\W)é(?=$|\W)/giu

// no accented character at the start or end of the word
/\bnível\b/giu

EDIT: Yes, that's true, it does not work with multiple characters. If you can test the length of the search term, you can still handle the cases separately, depending on whether you search for one character or several.

EDIT2: Actually, the last edit is wrong. It doesn't depend on the length but on whether the accented character is next to the boundary or not. So it would be /(?=^|\W)éternel\b/giu for "éternel" and /\bné(?=$|\W)/giu for "né".

Updated regex example: https://regex101.com/r/6v2gId/3

EDIT3: A small example of what I tried, to answer your last comment:

// No g flag here: with g, test() keeps state in lastIndex between calls.
var query = 'de';
var myRe = new RegExp("\\b" + query + "\\b", "iu");
var match = myRe.test("determinado de necessidade de comer é de");
document.getElementById('res1').textContent = match;
match = myRe.test("determinado necessidade comer é e");
document.getElementById('res2').textContent = match;

// The query ends with an accented character, so the trailing \b is replaced by a lookahead.
query = 'dé';
myRe = new RegExp("\\b" + query + "(?=$|\\W)", "iu");
match = myRe.test("déterminado dé necessidadé de comer é de");
document.getElementById('res3').textContent = match;
match = myRe.test("déterminado necessidadé comer é de");
document.getElementById('res4').textContent = match;
<span>test with "\\bde\\b":</span><br/>
<span>for "determinado de necessidade de comer é de":</span><span id="res1"></span><br/>
<span>for "determinado necessidade comer é e":</span><span id="res2"></span><br/><br/>
<span>test with "\\bdé(?=$|\\W)":</span><br/>
<span>for "déterminado dé necessidadé de comer é de":</span><span id="res3"></span><br/>
<span>for "déterminado necessidadé comer é de":</span><span id="res4"></span>
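
When one RegExp object is reused for several test() calls, the g flag matters: with g, test() stores the end of the last match in lastIndex and resumes the next search from there, so identical calls can give different answers. A minimal sketch of the effect (the variable names are illustrative):

```javascript
const withG = new RegExp("\\bde\\b", "giu");
const first = withG.test("de");  // true, lastIndex is now 2
const second = withG.test("de"); // false: the search resumed at index 2
const third = withG.test("de");  // true again: the failed call reset lastIndex to 0

// Without g, every call starts at the beginning of the string.
const withoutG = new RegExp("\\bde\\b", "iu");
const stable = [withoutG.test("de"), withoutG.test("de")]; // [true, true]

console.log(first, second, third, stable);
```

For a plain yes/no check, leaving out g (or resetting lastIndex to 0 before each call) avoids this.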
