regex - Python: match a word against a word list after removing repeating characters

I have a list of words with positive and negative sentiment, e.g. ['happy', 'sad'].

Now, when processing tweets, I'm removing repeating characters (allowing at most 2 repetitions):

happpppyyy -> happyy
saaad -> saad
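For reference, the squeezing step described above could be sketched roughly as follows. The function name `squeeze` and the `max_run` parameter are hypothetical, chosen only for illustration; the original post does not show how the repetitions are actually removed.

```python
import re

def squeeze(word, max_run=2):
    """Collapse any run of one character longer than max_run down to max_run.

    Hypothetical helper: the name and the max_run parameter are assumptions.
    """
    # (.) captures one character; \1{max_run,} requires at least max_run
    # further copies, so the whole match is a run of max_run + 1 or more.
    pattern = r"(.)\1{" + str(max_run) + r",}"
    return re.sub(pattern, lambda m: m.group(1) * max_run, word)

print(squeeze("happpppyyy"))  # happyy
print(squeeze("saaad"))       # saad
```

Runs that are already short enough, such as the "pp" in "happy", are left untouched.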

The check whether e.g. saad is part of the word list should then return True, because it is similar to sad.

How can I implement this behaviour?

I would build regular expressions dynamically, turning a word like:

happy

into

h+a+p+p+y+

Pass the list of "happy" words like this:

import re
# one regex per word: every character may repeat one or more times
re_list = [re.compile("".join(["{}+".format(c) for c in x])) for x in ['happy', 'glad']]

Then test (using any to return True if one of the "happy" regexes matches):

for w in ["haaappy", "saad", "glaad"]:
    print(w, any(re.match(x, w) for x in re_list))

result:

haaappy True
saad False
glaad True
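The same idea can be wrapped into small helpers and applied to both sentiment lists. This is only a sketch under some assumptions: the names `build_pattern` and `in_word_list` are made up for illustration, `re.escape` is added in case a word contains regex metacharacters, and `fullmatch` is swapped in for `match` so that the whole token has to fit the pattern.

```python
import re

def build_pattern(word):
    # Hypothetical helper: "sad" becomes the regex s+a+d+.
    # re.escape protects characters such as "." or "+" inside a word.
    return re.compile("".join(re.escape(c) + "+" for c in word))

def in_word_list(token, patterns):
    # fullmatch requires the entire token to match, not just a prefix.
    return any(p.fullmatch(token) for p in patterns)

positive = [build_pattern(w) for w in ["happy", "glad"]]
negative = [build_pattern(w) for w in ["sad"]]

for w in ["haaappy", "saad", "glaad"]:
    print(w, in_word_list(w, positive), in_word_list(w, negative))
```

With `re.match`, a token such as "happyness" would also count as a hit, because `match` only anchors at the start of the string; `fullmatch` rejects it. Which behaviour is wanted depends on how strictly the tweets should be matched.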
