MyBB Community Forums

Full Version: PHP Help
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Pages: 1 2
Hello all,

I have another dilemma here...I've brought it up before but it remains unsolved.

I have a custom searchform that allows you to enter search-criteria and with a tickbox define against which of the customfields the search is done.
Pretty easy thing to do once you just wrap the searchterm with %%....so no worries there...

Now...I have a number of fields like LASTNAME, FIRSTNAME etc. that do have diacritic characters...and these are not found with the search.

An example.... person's lastname is Mäki....now someone searches for him by using term Mäki...everything is fine....should you search for Maki or Maeki or anything like that....no go.

Soo.....at my work I've gone around this by using TRANSLATE('àèìòùỳ','aeiouy') and that's fine....but that's a function available in Hyperion suite.....not in regular SQL I believe.

Now is there anyone that can give me the code that I need to implement in my form that will turn at least the chars with the singlequote or doublequote on top, into their 'quoteless' equivalents (à=>a) on both the searchform as well as in the sql??? So basically....the code would translate on the fly both the search-term as well as the field in the database into it's 'quoteless' version.

So if someone searches for this person called Mäki, the system would bring up all records that meet criteria %Maki% or %Mäki%....the problem is that it should really be then I suppose also %Makì% or %Mäkì% or one of the many permutations.....shite...

anyone have a magic-formula/regexp/code to fix this?

cheers,
Kimmo
What's the charset/coalition used on the tables you have in MySQL? If a matching characterset for php->mysql connection, tables, and that of forms is used, then you shouldn't have any problems.

I recommend using UTF-8 all over.

Btw, is that finnish language? Which character set you have been using?
I have still 1.1.8 so it's whatever it was by default at that time....8859-1 I believe...

for reasons that are too long to go into...I can not go to 1.2.x yet....
So you are searching MyBB posts and topics and want matches like this:

Enter: Maki

and it matches: Mäkì, Mäki, Maki

Or Enter: Mäkì
and it matches: Mäkì, Maki, Makì

Is it? Or are you merely trying to convert the text input Mäkì to Maki, so that if user types 'Mäkì' in the textbox, the script uses 'Maki' to search in the MySQL tables?

Either way, I think something like this should work. It's case-insensitive, and accent-insensitive as far as I know, but sometimes it may cause unexpected results too, since it uses the SOUNDEX() algorithm and maybe it may consider Maky as a result too.

SELECT * FROM table WHERE table_fields SOUNDS LIKE 'Mäkì'

Then after the data is returned, you can use some php code to remove accents of the original search term, the returned results, and then match and see if the term really exists in result. I am not sure of any better way when using MySQL. As I remember, last time I used CONVERT() function to convert fields, it wasn't really effective.

function remove_accent($title) 
{
    // from my Spicefuse SEO plugin (not released)
    // translate accents -- and maybe remove few non-ascii characters too, while on it..
    $title = preg_replace('#&([a-zA-Z])(circ|uml|grave|tilde|acute|cedil|ring|th);#', '\\1', htmlentities($title, null, 'utf-8'));
    $title = preg_replace("/&([a-z]+);/", "", $title);
    return $title;
}

$search = 'Mäkì';
$search = strtolower(remove_accent($search));
// some loop returning results
while (......) 
{
      if (!strstr(strtolower(remove_accent($result['data'])), $search)) {
            // ignore this as it doesn't contain the search term .. 
            continue;
      }
}
Alternatively you could use regular expressions (if the soundex idea doesn't work)

I'm not too sure if this will work - and if it does, how quick it will be but..
$search_terms = preg_replace('#\^\$\.\*\+\?\|\(\)\{\}\-\=\:\<\>#', '\\$1', $search_terms); // Add quotes to any regexp terms which may be in the search terms

// Our list of terms
$search_replacements = array(
	"à" => "(a|à)",
	"è" => "(e|è)"
);

// Replace to build regexp
$search_terms = str_replace(array_keys($search_replacements), $search_replacements, $search_terms);

// Bingo
$query = $db->query("SELECT * FROM users WHERE name REGEXP '{$search_terms}'");
hey this is great! Now I wish I knew how to implement it!!! Sad

If either of you two, or anyone who knows how would be so kind to implement either of these methods (preferably the faster since I've got 20,000 record usertable and this is meant for usertable), on the attached file I would appreciate it!!

cheers,
Kimmo
uhh thats a lot of searching there... You want it only for the username or for all those fields?
Asad_Niazi Wrote:uhh thats a lot of searching there... You want it only for the username or for all those fields?

well it's actually just a single search-field....it's a form we have and it just allows to define against which field the search is done.
I can send you a private message with the login-details in a pm as you have to be registered member in our community in order to use the search.

Actually the code in the file I've given here probably hints towards the code that we have customized that allows admin to specify whether a field is 'searchable' at the time of creation of the customfield. If marked yes, it will show up as a tickbox on this form.

cheers,
Kimmo
May give it a try .. I think the utf8 character set can be accent-insenstive in itself, and maybe a conversion can help. File attached, with change on line 103.
"ERROR:
Could not execute SQL statement to get result count"

with this file Sad
Pages: 1 2