MyBB Community Forums

Full Version: RegEx problems
So I'm working on a plugin that finds URLs in posts and rewrites both the link URL and the link text that was originally typed in, but I've run into a few problems with the "parse_message_end" hook and with my current RegEx code.

Typing in www.google.com or http://www.google.com returns this:
mysite.com/page.php?url=http://www.google.com" target="_blank
which is almost right, but obviously I don't want the trailing quotation mark and the whole target= part.

Basically I just need it to cut off the last quotation mark and the whole target=_blank code. Also I need it to actually be linked, not just plaintext, which it is currently. Any solution ideas? I'm assuming the problem is in my RegEx find code, which is:
'/<a href=\"((https?|ftps?):\/\/(.*?))\">(.*?)<\/a>/is'
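To see why the capture runs past the URL, here is a minimal sketch. The sample anchor markup is an assumption reconstructed from the output shown above, not copied from MyBB's parser.

```php
<?php
// Assumed sample of the anchor produced for a bare URL (hypothetical markup).
$html = '<a href="http://www.google.com" target="_blank">http://www.google.com</a>';

// The pattern from the post. The lazy (.*?) keeps expanding until it reaches
// the first `">` in the tag, and that only occurs after target="_blank.
$pattern = '/<a href=\"((https?|ftps?):\/\/(.*?))\">(.*?)<\/a>/is';

preg_match($pattern, $html, $m);

// Group 1 is the captured "URL", including the stray attribute text.
echo $m[1], "\n";
// Group 4 is the link text between the opening and closing tags.
echo $m[4], "\n";
```

With this sample input, group 1 comes out as the URL followed by the quote and the target attribute text, which matches the output described above.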

Thanks
Your code is really hard to read. I think this is much easier (and technically identical), since using # as the delimiter means the slashes and quotes no longer need escaping:
#<a href="((https?|ftps?)://(.*?)">(.*?)</a>#is

I didn't try it, but you can replace the . with [^"] (any character except a double quote):
#<a href="((https?|ftps?)://([^"]*?)">(.*?)</a>#is
(2011-10-09, 04:40 PM) patrick wrote: your code is really hard to read. I think this is much easier (and technically identical):
#<a href="((https?|ftps?)://(.*?)">(.*?)</a>#is

didn't try it, but you can replace the . with [^"] (=anything but quote).
#<a href="((https?|ftps?)://([^"]*?)">(.*?)</a>#is

Thanks for the attempt, but neither worked :(. The first one gave me the same problem, and the second didn't do anything at all.

Also, both patterns have an extra opening parenthesis after href=".
You're right about the parenthesis.

The reason the 2nd one doesn't work is that [^"] stops at the closing quote of href, and the pattern then expects > immediately, but the tag continues with target=

try this:
#<a href="((https?|ftps?)://([^"]*?)).*?>(.*?)</a>#is
(2011-10-09, 07:02 PM) patrick wrote: you're right about the parenthesis

The reason the 2nd one doesn't work, is because it stops at target=

try this:
#<a href="((https?|ftps?)://([^"]*?)).*?>(.*?)</a>#is

Using that I now get
http://mysite.com/page.php?url=http://
when I enter http://www.google.com
My mistake. The [^"]* part should be greedy, so you have to leave out the ? after it:
#<a href="((https?|ftps?)://([^"]*)).*?>(.*?)</a>#is

this should work as well:
#<a href="((https?|ftps?)://(.*?))".*?>(.*?)</a>#is
The .*? after the closing " is what skips over the rest of the tag, so you capture the URL without getting the target=_blank
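A small sketch comparing the lazy and greedy versions of the pattern may make the difference clearer. The sample markup is an assumption based on the output reported earlier in the thread.

```php
<?php
// Assumed sample of the anchor produced for a bare URL (hypothetical markup).
$html = '<a href="http://www.google.com" target="_blank">http://www.google.com</a>';

// Lazy [^"]*? matches as little as possible (nothing at all here), and the
// following .*?> then consumes the rest of the tag, so group 1 is only the scheme.
$lazy = '#<a href="((https?|ftps?)://([^"]*?)).*?>(.*?)</a>#is';

// Greedy [^"]* runs up to the closing quote of href before .*?> takes over.
$greedy = '#<a href="((https?|ftps?)://([^"]*)).*?>(.*?)</a>#is';

preg_match($lazy, $html, $lazyMatch);
preg_match($greedy, $html, $greedyMatch);

echo $lazyMatch[1], "\n";   // http://
echo $greedyMatch[1], "\n"; // http://www.google.com
echo $greedyMatch[4], "\n"; // http://www.google.com (the link text)
```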
(2011-10-09, 08:17 PM) patrick wrote: my mistake. It should be greedy. You have to leave out the ?
#<a href="((https?|ftps?)://([^"]*)).*?>(.*?)</a>#is

this should work as well:
#<a href="((https?|ftps?)://(.*?))".*?>(.*?)</a>#is
the .*? after " is what makes you catch without getting the target=_blank

Awesome! The first one worked! Now that that's solved, the next part of my problem is making that text an actual link; right now it's just plaintext. Basically, I need the result to be:
[url=http://mysite.com/page.php?url=http://originalremovedURL.com]http://originalremovedURL.com[/url]
I've tried adding in the url tags manually, but they just show up as plaintext as well, instead of actually turning the text into a URL.

How would I go about doing all that? (Again, thanks for helping, my RegEx skills are nonexistent :( )
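One possible approach, sketched under assumptions: since the pattern in this thread already matches <a href=...> tags, the message at this hook appears to be rendered HTML, which would explain why [url] tags added there stay as plain text. The sketch below therefore writes the anchor tag directly in the replacement. The function name rewrite_links is hypothetical, and the mysite.com/page.php?url= prefix is taken from the posts above.

```php
<?php
// Hypothetical helper: rewrites every http(s)/ftp(s) anchor so that its href
// goes through the redirect page, keeping other attributes and the link text.
function rewrite_links(string $message): string
{
    // Group 1: original URL, group 2: remaining attributes, group 3: link text.
    $pattern = '#<a href="((?:https?|ftps?)://[^"]*)"([^>]*)>(.*?)</a>#is';

    // Emit a real anchor tag instead of BBCode, since the input is already HTML.
    $replacement = '<a href="http://mysite.com/page.php?url=$1"$2>$3</a>';

    // preg_replace returns null on error; fall back to the untouched message.
    return preg_replace($pattern, $replacement, $message) ?? $message;
}

$html = '<a href="http://www.google.com" target="_blank">http://www.google.com</a>';
echo rewrite_links($html), "\n";
```

Running the function twice on the same text would wrap the links a second time, so it is meant to be applied once per message. If page.php reads the target from the query string, encoding it (for example with rawurlencode inside preg_replace_callback) may be worth considering.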
What are you trying to do? You might not even need regex.
So basically you want:
$text = str_replace('<a href="http://', '<a href="http://mysite.com?http://', $text);
$text = str_replace('<a href="ftp://', '<a href="http://mysite.com?ftp://', $text);

That's much, much, MUCH faster than using regex (I never benchmarked it in PHP, but in C# the difference is huge).
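A self-contained sketch of the str_replace approach, for comparison. The function name prefix_links is hypothetical, and the page.php?url= prefix is borrowed from the first post rather than from the snippet above. It assumes every link starts with exactly <a href=" followed by a lowercase scheme.

```php
<?php
// Hypothetical helper: prefixes the href of http, https and ftp anchors
// with the redirect page using plain string replacement (no regex).
function prefix_links(string $text): string
{
    $prefix = 'http://mysite.com/page.php?url=';

    // str_replace applies the pairs one after another, so http:// comes first:
    // the later https:// and ftp:// searches cannot match the already
    // prefixed links, which all start with <a href="http://mysite.com/.
    $search = ['<a href="http://', '<a href="https://', '<a href="ftp://'];
    $replace = [
        '<a href="' . $prefix . 'http://',
        '<a href="' . $prefix . 'https://',
        '<a href="' . $prefix . 'ftp://',
    ];

    return str_replace($search, $replace, $text);
}

$text = '<a href="http://a.com">a</a> <a href="https://b.com">b</a> <a href="ftp://c.com">c</a>';
echo prefix_links($text), "\n";
```

Unlike the regex version, this is case-sensitive and only handles the schemes listed, and calling it twice on the same text would prefix the links again.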