  1. #1
    Senior Member filburt1
    Join Date
    Jul 2002
    Location
    Maryland, US
    Posts
    11,774
    Member #
    3
    Liked
    21 times
    I'm working on a site (for my job) where users can post text in a WYSIWYG editor. The user usually types it up in Word, then copies and pastes it in the editor. Remember that Word does things like smart quotes (curly quotes instead of straight quotes), em dashes, etc. That totally screws up the output. For example, it will look like:
    SKYDEX™ does more with less. Impacts can be absorbed in less space allowing for more comfort. SKYDEX™ provides an optimum balance of cushioning, flexibility and weight.
    The problem looks to be Unicode characters being put into an ASCII character space (the table originally had a collation of latin1_swedish_ci, now it's utf8_general_ci). How can I fix this, either with an SQL query to replace the old data or with server-side parsing of the data coming out of the database?

    It doesn't have to apply to future posts because they've been trained not to use smart quotes, etc.
    filburt1, Web Design Forums.net founder
    Site of the Month contest: submit your site or vote for the winner!
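
For anyone reading along: one common way this kind of garbled output arises is that the text is stored as UTF-8 but something along the way reads those bytes as Windows-1252. The sketch below only illustrates that mechanism and is not a diagnosis of this particular site. It assumes PHP 7+ with the mbstring extension, and the variable names are made up.

```php
<?php
// "SKYDEX" followed by the trademark sign U+2122, encoded as UTF-8.
$original = "SKYDEX\u{2122}";

// Simulate the misreading: treat the UTF-8 bytes as Windows-1252
// and convert each byte to its own character.
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

echo bin2hex($original), "\n"; // 534b59444558e284a2
echo $garbled, "\n";           // SKYDEXâ„¢
```

The trademark sign is three bytes in UTF-8 (E2 84 A2), and each of those bytes is a separate printable character in Windows-1252, which is where the three-character junk comes from.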

  3. #2
    Senior Member Stylise
    Join Date
    Jul 2005
    Location
    Mount Martha, Australia
    Posts
    229
    Member #
    10679
    I recently ran into this while working on an XCDP project, where the XML feed was UTF-8 and, unaware, I had my database set to the default, latin1_swedish.

    Are you using PHP? What you have to do is make sure PHP is using UTF-8 (you can check this by using var_dump to output the user input before inserting it into the database).

    A simple,
    PHP Code:
    ini_set("default_charset", "UTF-8");
    will do the job. You should be able to just edit the php.ini file directly too, but you don't always get that choice.

    If it's OK up to this point, then it's the database that's doing it. You need to set the DB connection to use UTF-8, but make sure to only do this once, otherwise you'll be encoding the chars more than once.

    So, just after you select your database, after you connect, use:

    PHP Code:
    mysql_query("SET NAMES 'utf8'"); 
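    That way, when you go to insert/update, it will use that charset. It has to be set explicitly because the connection has its own character set, separate from the table's, and it often defaults to latin1.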


    Hope this helps!
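
As a rough sketch of the "check the input first" step described above: var_dump shows the string and its byte length, but it can help to also look at the raw bytes and ask PHP whether they form valid UTF-8. The helper name describeEncoding is hypothetical, and the example assumes PHP 7+ with the mbstring extension.

```php
<?php
// Hypothetical helper: report whether a string is valid UTF-8
// and show its raw bytes as hex.
function describeEncoding(string $input): string
{
    $valid = mb_check_encoding($input, 'UTF-8') ? 'valid UTF-8' : 'NOT valid UTF-8';
    return sprintf('%s (%d bytes, hex %s)', $valid, strlen($input), bin2hex($input));
}

// Curly quotes around "hi", encoded as UTF-8.
echo describeEncoding("\u{201C}hi\u{201D}"), "\n";
// valid UTF-8 (8 bytes, hex e2809c6869e2809d)

// The same curly quotes as raw Windows-1252 bytes, as Word might produce.
echo describeEncoding("\x93hi\x94"), "\n";
// NOT valid UTF-8 (4 bytes, hex 93686994)
```

If the input already fails this check before it reaches the database, the problem is upstream of MySQL.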

  4. #3
    Senior Member filburt1
    Didn't work, just outputted the same thing:
    PHP Code:
        ini_set("default_charset", "UTF-8");

        require_once($_SERVER['DOCUMENT_ROOT'] . "/lib/init.php");

        mysql_query("SET NAMES utf8") or die(mysql_error());

        $result = mysql_query("SELECT text FROM pressbox_news WHERE id = 38") or die(mysql_error());
        $row = mysql_fetch_row($result);
        echo $row[0];
    In case I wasn't clear before, the problem is that there is already data in the table that is screwed up. This isn't to control new data going into the table.
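
Note for later readers: the mysql_* functions used in the snippet above were removed in PHP 7. A rough equivalent of the same test with mysqli might look like the sketch below. It assumes an already-open mysqli connection is passed in, and fetchNewsText is a hypothetical name. Like the original, it only controls the connection character set and does nothing about how the browser decodes the page.

```php
<?php
// Hypothetical helper (not from the thread): fetch one news item's text
// over a connection whose character set is explicitly set to UTF-8.
// Assumes $db is an already-connected mysqli object.
function fetchNewsText(mysqli $db, int $id): ?string
{
    // Connection-level charset; the mysqli counterpart of "SET NAMES utf8".
    $db->set_charset('utf8');

    // Table and column names are taken from the query in the post above.
    $stmt = $db->prepare('SELECT text FROM pressbox_news WHERE id = ?');
    $stmt->bind_param('i', $id);
    $stmt->execute();
    $stmt->bind_result($text);
    $found = $stmt->fetch(); // true if a row was fetched, null if there was none
    $stmt->close();

    return $found ? $text : null;
}
```

Usage would be something like `echo fetchNewsText($db, 38);` once `$db` has been created with `new mysqli(...)`.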

  5. #4
    Senior Member Stylise
    Ahh OK, sorry, I misunderstood you. If the special characters are already stored in the database that way, then I'm not entirely sure how you'd go about changing them over, other than running multiple str_replace calls over all the records. But that's far from ideal.
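
If the stored rows really were garbled (UTF-8 bytes that got read as Windows-1252 and saved again as UTF-8), one alternative to a long list of str_replace pairs is to reverse that conversion per string. This is only a sketch under that assumption. The function name repairMojibake is hypothetical, and it needs PHP 7+ with the mbstring extension.

```php
<?php
// Hypothetical helper: undo one round of "UTF-8 read as Windows-1252" garbling.
// Returns the input unchanged when it does not look garbled in that way.
function repairMojibake(string $text): string
{
    // Map each character back to the single Windows-1252 byte it came from.
    $candidate = mb_convert_encoding($text, 'Windows-1252', 'UTF-8');

    // Garbling the candidate again must reproduce the input exactly;
    // this rejects strings containing characters Windows-1252 cannot hold.
    $roundTrip = mb_convert_encoding($candidate, 'UTF-8', 'Windows-1252');

    if ($candidate !== $text
        && mb_check_encoding($candidate, 'UTF-8')
        && $roundTrip === $text) {
        return $candidate;
    }
    return $text;
}

// Garbled trademark sign (a-circumflex, low double quote, cent sign) is repaired.
echo repairMojibake("SKYDEX\u{00E2}\u{201E}\u{00A2}"), "\n"; // SKYDEX™
// Text that is already clean is left alone.
echo repairMojibake("SKYDEX\u{2122}"), "\n";                 // SKYDEX™
```

It works on a whole string at a time, so a row that mixes garbled and clean non-ASCII text is left untouched. Try it on a copy of the table before running any UPDATE.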

  6. #5
    Senior Member
    Join Date
    Jun 2005
    Location
    Atlanta, GA
    Posts
    4,146
    Member #
    10263
    Liked
    1 times
    At a glance, it looks like this article might be of use to you. I didn't have time to do much more than skim it, though.

  7. #6
    Senior Member filburt1
    It's crazy how such a seemingly simple problem as UTF-8 text can wreak so much havoc.

  8. #7
    Senior Member filburt1
    Fixes it perfectly:
    PHP Code:
        if (!headers_sent())
        {
            header("Content-type: text/html; charset=utf-8; language=en");
        }

  9. #8
    Senior Member
    Hehehe. Awesome.

  10. #9
    WDF Staff Wired
    Join Date
    Apr 2003
    Posts
    7,656
    Member #
    1234
    Liked
    137 times
    filburt coding vBulletin @ work? not surprised
    The Rules
    Was another WDF member's post helpful? Click the like button below the post.

    Admin at houseofhelp.com

