[Pushed] Bad words filter is not working in 1.8.17
#11
(2018-07-23, 10:49 AM)linguist Wrote: Just asking: since they basically _look_ the same, did you make sure that in the Cyrillic string the <je> are Serbian Cyrillic letters, not Latin letters? They look alike, but they have different Unicode code points and will thus be treated as different in a database search etc.:
Cyrillic: U+0458 U+0435 : је
Latin:    U+006A U+0065 : je

If you want to be 100% safe, you'd need to have four patterns to exclude, because people use these lookalike letters all the time to circumvent filters:
<jeb> (all Latin)
<јеb>  (Cyrillic је plus Latin b)
<јеб> (all Cyrillic)
<jeб> (Latin je, Cyrillic b)
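
Rather than listing every combination by hand, the four patterns above can be folded into one regular expression with a character class per letter. This is only a minimal sketch in Python, not how the MyBB filter works internally; the `LOOKALIKES` table and the `lookalike_pattern` helper are hypothetical names made up for illustration, and the table covers just the three letters from the example.

```python
import re

# Hypothetical table: Latin letter -> Cyrillic counterpart used to dodge filters.
# Only the letters from the <jeb> example are listed; a real table would be larger.
LOOKALIKES = {
    "j": "\u0458",  # CYRILLIC SMALL LETTER JE
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "b": "\u0431",  # CYRILLIC SMALL LETTER BE
}

def lookalike_pattern(word):
    """Build a regex matching `word` written with any mix of Latin and Cyrillic letters."""
    parts = []
    for ch in word:
        alt = LOOKALIKES.get(ch)
        if alt is None:
            # No known counterpart: match the character literally.
            parts.append(re.escape(ch))
        else:
            # Accept either the Latin letter or its Cyrillic counterpart.
            parts.append("[" + ch + alt + "]")
    return re.compile("".join(parts))

pattern = lookalike_pattern("jeb")
```

With this, `pattern.search(text)` finds the all-Latin, all-Cyrillic and both mixed spellings in one pass, so the filter list needs one entry per word instead of four.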

Sorry for the delayed reply. They are similar, it's true, but not the same. Thanks.

I appreciate the recent changes to the bad words filter, but the filter is not working in 1.8.18 for Serbian Cyrillic letters.

I hope this will be corrected in version 1.8.19.
The translation for the latest forum version is in one of my more recent posts. >>link<<
#12
(2018-09-02, 07:07 PM)vojislavradoja Wrote: Sorry for the delayed reply. They are similar, it's true, but not the same. Thanks.

Of course they are not the same. That's the point. They are assigned distinct code points in Unicode and in other character mappings too, so this is not specific to UTF-8.
However, depending on the font on a user's system, they may or may not look different. Cyrillic and Latin small <e> and <o>, for example, are indistinguishable in many sans-serif fonts. This is a known vector for spoofing and phishing attempts.
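
Since the eye cannot be trusted here, one way to spot such spoofs is to check whether a single word mixes scripts. The following is a rough Python sketch under a stated assumption: it takes the first word of each character's Unicode name (e.g. "LATIN", "CYRILLIC") as the script, which is a heuristic and not the official Unicode Script property. The function names `scripts_in` and `is_mixed_script` are made up for this example.

```python
import unicodedata

def scripts_in(word):
    """Return the set of script names (by Unicode name prefix) used by letters in `word`."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # Names start with the script, e.g. "CYRILLIC SMALL LETTER IE".
            # Heuristic only: this is not the Unicode Script property.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def is_mixed_script(word):
    """True if the letters of `word` come from more than one script."""
    return len(scripts_in(word)) > 1
```

A word that is entirely Latin or entirely Cyrillic passes, while something like Latin <j> and <b> around a Cyrillic <е> gets flagged, even though all three spellings look the same on screen.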