
MySQL: Charset and Collation

Blog category: MySQL. Tags: MySQL, Charset, Collation, Garbled, Data Loss

程序员文章站 2024-02-24 18:55:04
...

1. Introduction

    1) create table table_name (column_declaration) charset utf8;

    2) set names gbk;

    Questions:

    1) What do these two statements mean?

    2) What is the difference between them?

 

2. Charset

    1) Charset hierarchy: (Server>Database>Table>Column)

        1) Server default charset

        2) Database default charset

        3) Table default charset

        4) Column default charset

    2) Charset hierarchy policy:

        1) If no charset is declared at a given level, that level inherits the charset of its parent.

        2) The server level always has a charset: if none is configured, the server falls back to its compiled-in default (latin1 before MySQL 8.0, utf8mb4 from 8.0 on).
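
The inheritance rule can be checked in a scratch session. This is only a minimal sketch: the names demo_db, t_inherit and t_override are hypothetical and do not come from the original post.

```sql
# Server level: the default used when a new database declares no charset
show variables like 'character_set_server';

# Database level: declared explicitly (demo_db is a hypothetical name)
create database demo_db charset gbk;
use demo_db;

# Table level: no charset declared, so the table inherits gbk from demo_db
create table t_inherit (name varchar(10));

# Column level: the column declares latin1 and overrides the table's utf8
create table t_override (name varchar(10) charset latin1) charset utf8;

# Inspect what each table and column ended up with
show create table t_inherit;
show full columns from t_override;
```

The more specific level wins: a declaration on a column overrides the table, which overrides the database, which overrides the server.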

    3) Understanding the translator:

        1) The client/console has its own charset

        2) The translator has its own charset

        3) The database has its own charset

        [Figure: client/console (gbk) -> translator (transit data in utf8, marked in red) -> database (utf8); output returned to the client in gbk]


    4) Translator:

        1) The translator has to know the charset of the input data.       ---> In the figure above, the client/console sends gbk.                  ---> character_set_client = gbk;

        2) The translator has to know the charset of the transit data.     ---> In the figure above, the transit data is utf8 (marked in red).      ---> character_set_connection = utf8;

        3) The translator has to know the charset of the database/table.   ---> In the figure above, the database uses utf8.                        ---> create table table_name (column_declaration) charset utf8;

        4) The translator has to know the charset of the output data.      ---> In the figure above, the output is gbk.                             ---> character_set_results = gbk;

        Comments: If character_set_client, character_set_connection and character_set_results all share the same value N, we can write "set names N" as shorthand.
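
As a short sketch of the shorthand described above (gbk and utf8 are simply the values used in this post), the two forms below set the same three session variables:

```sql
# Shorthand form
set names gbk;

# Equivalent long form
set character_set_client = gbk;
set character_set_connection = gbk;
set character_set_results = gbk;

# When the three values differ (as in the figure: gbk in, utf8 in transit, gbk out),
# the shorthand does not apply and each variable has to be set on its own
set character_set_client = gbk;
set character_set_connection = utf8;
set character_set_results = gbk;

# Check the current session values
show variables like 'character_set_%';
```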

    

    5) When does garbled text occur?

        1) character_set_client does not match the real input charset. Eg: the data typed into the console is in gbk, but we declare character_set_client = utf8, so the data is garbled.

        2) character_set_results does not match the real output charset. Eg: the web page displays data as utf8, but we declare character_set_results = gbk, so the data is garbled.
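
The effect of declaring the wrong charset can be reproduced without touching the session variables. This is a rough sketch that feeds raw bytes through unhex() so it does not depend on the client settings; the sample string '中文' is an assumption chosen for illustration.

```sql
# E4B8ADE69687 is the utf8 byte sequence for the two-character string '中文'

# Declared correctly: the bytes are read as utf8
select convert(unhex('E4B8ADE69687') using utf8) as declared_correctly;

# Declared wrongly: the same bytes are read as gbk, so the result is no longer '中文'
select convert(unhex('E4B8ADE69687') using gbk) as declared_wrongly;
```

The bytes themselves are intact in both queries; only the declared charset differs, which is why garbling of this kind goes away once the declaration matches the data.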

 

    6) When is data lost?

        1) When character_set_connection or the database charset covers fewer characters than the charset of the data passed from the client.

        Eg: gbk->latin1->gbk: Data is lost while translating from the client data to the transit data.

              gbk->gbk->latin1: Data is lost while translating from the transit data to the database.
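
A minimal sketch of the lossy path, again using raw bytes so the result does not depend on the session charset (the sample string '中文' is an assumption for illustration):

```sql
# D6D0CEC4 is the gbk byte sequence for '中文'

# gbk -> latin1: latin1 has no Chinese characters, so they cannot be kept
select hex(convert(convert(unhex('D6D0CEC4') using gbk) using latin1)) as after_latin1;

# gbk -> latin1 -> gbk: converting back does not restore the original bytes
select hex(convert(convert(convert(unhex('D6D0CEC4') using gbk) using latin1) using gbk)) as round_trip;

# gbk -> utf8 -> gbk: utf8 can represent these characters, so nothing is lost
select hex(convert(convert(convert(unhex('D6D0CEC4') using gbk) using utf8) using gbk)) as via_utf8;
```

Unlike garbled text, this cannot be repaired afterwards by fixing a declaration, because the original bytes are gone once the narrower charset has been passed through.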

    7) Real world problem:

         1) For some reason, the data is stored in the database with charset gbk, and this cannot be changed.

         2) The client is a PHP page that passes data in utf8.

         3) Solution: set names utf8;  create table table_name(column_declaration) charset gbk;
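
Spelled out as a session, the solution might look like the sketch below; the table name articles_gbk and its column are hypothetical.

```sql
# The table stays in gbk, as required (articles_gbk is a hypothetical name)
create table articles_gbk (title varchar(50)) charset gbk;

# The PHP client sends and expects utf8, so declare utf8 for the connection
set names utf8;

# The server converts utf8 -> gbk on insert and gbk -> utf8 on select
insert into articles_gbk values ('中文');
select title from articles_gbk;
```

Each side declares its real charset and the server does the translation, so nothing is garbled as long as every character sent also exists in gbk.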


3. Collation

    1) Introduction

# Create table
create table temp(name varchar(12));
# Insert data
insert into temp values('a'), ('B'), ('c'), ('D');
# Order data
select * from temp order by name asc;
+------+
| name |
+------+
| a    |
| B    |
| c    |
| D    |
+------+
# Q: a->97, B->66. Why a < B?
# A: Refer to collation.

    2) What is collation?

        1) To sort the rows of a table by a column, we need a rule for comparing characters. That rule is the collation.

    3) What is the relationship between charset and collation?

        1) One charset may have many collations.

# List all collations
show collation;

# List the collations whose names start with utf8
show collation like 'utf8%';
# utf8 has about 40 collations (the exact number depends on the MySQL version).

         2) The default collation for utf8 is 'utf8_general_ci', which is case insensitive ("ci").

            'utf8_bin' orders by the binary code of each character (for ASCII letters, the ASCII code).

# Create table
create table temp2(name varchar(11)) charset utf8 collate=utf8_bin;
# Insert data
insert into temp2 values('a'), ('B'), ('c'), ('D');
# Order data
select * from temp2 order by name asc;
+------+
| name |
+------+
| B    |
| D    |
| a    |
| c    |
+------+
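
The collation declared on a table is only a default. As a small sketch reusing the temp2 table above, a single query can override it with a collate clause:

```sql
# temp2 was created with utf8_bin, so the default order is by byte value
select * from temp2 order by name asc;

# Override the collation for this query only: case-insensitive ordering
select * from temp2 order by name collate utf8_general_ci asc;

# Show the collation stored for each column of temp2
show full columns from temp2;
```

The collation named in the collate clause has to belong to the column's charset, which is why utf8_general_ci is valid here for a utf8 column.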

 
