Regex woes…

I have come up with this regex:

^.*[[:<:]]T[[:alnum:]]* [[:<:]]P[[:alnum:]]*.*$

(yes, it’s old-style ereg syntax for use in a MySQL statement)

In PCRE syntax it’d look like:

/^.*\bT\w* \bP\w*.*$/i

It will match ‘T P’, ‘Tony P’, ‘Tony Parsons’, ‘Some Tiny People’…

So far so good. Unfortunately I need to find a way to tell it NOT to match certain words, such as ‘The’.

i.e. it should match ‘Theremin Player’ and ‘The Tree Party’, but not ‘The Pastors’ or ‘Tony The Pony’.

If I were doing this in PHP it’d be easier, but since it’s in a MySQL statement I need to somehow perform that operation all in one (old-style…) regex. And I’m stumped.

I need to do something like:

word-left-boundary,
followed by t, followed by zero or more word characters (but not he),
word-right-boundary
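For reference, in an engine that supports lookahead, the spec above can be written as a negative lookahead placed right before the T-word. The sketch below uses Python's `re` module purely to pin down the intended behaviour; it does not solve the MySQL problem, since the old-style engine has no lookahead. The names `STOP_WORDS`, `build_pattern` and `matches` are made up for illustration, and the `^.*` / `.*$` wrappers are dropped because `search` scans the whole string anyway.

```python
import re

# Hypothetical stop-word list: whole words the T-word must NOT be.
STOP_WORDS = ["the", "trio"]

def build_pattern(stop_words):
    # (?!(?:the|trio)\b) rejects a stop word only when it is the whole word,
    # so 'Theremin' and 'Trionomic' still pass.
    blocked = "|".join(re.escape(word) for word in stop_words)
    return re.compile(rf"\b(?!(?:{blocked})\b)T\w* P\w*", re.IGNORECASE)

PATTERN = build_pattern(STOP_WORDS)

def matches(name):
    """Return True if name contains an allowed T-word followed by a P-word."""
    return PATTERN.search(name) is not None

if __name__ == "__main__":
    for name in ["Theremin Player", "The Tree Party", "The Pastors", "Tony The Pony"]:
        print(name, "->", matches(name))
```

Adding another word to the exclusion list is then just a matter of appending it to `STOP_WORDS`.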

If anyone knows how to do this it’d be a big help!

About anentropic
songwriter, musician, and er web programmer...

One Response to Regex woes…

  1. blueskiwi says:

    /^.*\bT(?(?=(he\b|rio\b))_|\w*) \bP\w*.*$/i

    Well, this regex does what I want – it will match ‘Thermo Plastic’ but not ‘The Peacock’. It will also match ‘Trionomic Pleasantries’ but not ‘Trio Plasm’, since I have a whole list of words I don’t want to match, not just ‘The’.

    My problem now is this PCRE regex won’t work in MySQL…
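One possible way around the missing lookahead, sketched here for the single word ‘The’ only, is to spell out every tail that is allowed after the T: any run of three or more characters, any two characters other than "he", or at most one character. The Python below checks that idea with `re`. The constant `MYSQL_ERE` is an untested hand translation into the old-style syntax, and it assumes that MySQL applies its case-insensitive matching to bracket ranges too. All names here are hypothetical.

```python
import re

ALNUM = "[A-Za-z0-9]"  # ASCII stand-in for [[:alnum:]]

# Every alphanumeric tail after the leading T except exactly "he".
TAIL = (
    "(?:"
    f"{ALNUM}{{3,}}"        # three or more characters: cannot be "he"
    f"|[a-gi-z0-9]{ALNUM}"  # two characters, the first is not h
    "|h[a-df-z0-9]"         # two characters, h followed by anything but e
    f"|{ALNUM}?"            # zero or one character
    ")"
)

NO_THE = re.compile(rf"\bT{TAIL} P{ALNUM}*", re.IGNORECASE)

# Untested translation to old-style MySQL syntax (an assumption, not verified).
MYSQL_ERE = (
    "[[:<:]]T([[:alnum:]]{3,}|[a-gi-z0-9][[:alnum:]]"
    "|h[a-df-z0-9]|[[:alnum:]]?) P[[:alnum:]]*"
)

def matches_without_lookahead(name):
    """Return True if a T-word other than 'The' is directly followed by a P-word."""
    return NO_THE.search(name) is not None
```

The obvious drawback is that each additional excluded word needs its own hand-built set of alternatives, so this gets unwieldy quickly for a long list.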
