Cassandra一致性日志分析
接上一篇关于Cassandra一致性的讨论,见:
通过对Cassandra源码添加若干日志,今天应该是找到了问题的根本原因。共三台机器10.130.24.90,10.130.24.91,10.130.24.143
问题原因分析:
日志按照请求的先后顺序截取。表格的主键为(empID, deptID)
日志1:
请求被发送到91机器,要求插入first_name为eran10003的数据,可以看到时间戳为1392953766795000,取到了first_name为eran10003
DEBUG [Thrift:22] 2014-02-21 11:36:06,795 CassandraServer.java (line 1916) execute_cql3_query:[INSERT INTO employees (empID, deptID, first_name, last_name) VALUES (111, 222, 'eran10003', 'landau');] with clever:QUORUM
TRACE [Thrift:22] 2014-02-21 11:36:06,795 QueryProcessor.java (line 97) Process [email protected] @CL.QUORUM
INFO [Thrift:22] 2014-02-21 11:36:06,795 StorageProxy.java (line 801) destination:/10.130.24.90
INFO [Thrift:22] 2014-02-21 11:36:06,795 StorageProxy.java (line 822) insert to different server
INFO [Thrift:22] 2014-02-21 11:36:06,796 StorageProxy.java (line 831) direct writes to local DC
TRACE [Thrift:22] 2014-02-21 11:36:06,796 MessagingService.java (line 615) /10.130.24.91 sending MUTATION to [email protected]/10.130.24.90
INFO [Thrift:22] 2014-02-21 11:36:06,796 StorageProxy.java (line 801) destination:/10.130.24.143
INFO [Thrift:22] 2014-02-21 11:36:06,796 StorageProxy.java (line 822) insert to different server
INFO [Thrift:22] 2014-02-21 11:36:06,796 StorageProxy.java (line 831) direct writes to local DC
TRACE [Thrift:22] 2014-02-21 11:36:06,796 MessagingService.java (line 615) /10.130.24.91 sending MUTATION to [email protected]/10.130.24.143
INFO [Thrift:22] 2014-02-21 11:36:06,796 StorageProxy.java (line 801) destination:/10.130.24.91
INFO [Thrift:22] 2014-02-21 11:36:06,797 StorageProxy.java (line 817) insertLocal
DEBUG [MutationStage:112] 2014-02-21 11:36:06,797 Keyspace.java (line 351) RowMutation:RowMutation(keyspace='contactks', key='0000006f', modifications=[java.nio.HeapByteBuffer[pos=0 lim=10 cap=12]: timestamp:1392953766795000,java.nio.HeapByteBuffer[pos=0 lim=20 cap=38]:eran10003 timestamp:1392953766795000,java.nio.HeapByteBuffer[pos=0 lim=19 cap=36]:landau timestamp:1392953766795000,])
DEBUG [MutationStage:112] 2014-02-21 11:36:06,797 Keyspace.java (line 359) Appending to commitlog
DEBUG [MutationStage:112] 2014-02-21 11:36:06,797 Keyspace.java (line 375) Adding to employees memtable
DEBUG [Thread-3] 2014-02-21 11:36:06,798 MessagingService.java (line 697) Message received from /10.130.24.90, id :29474
DEBUG [RequestResponseStage:312] 2014-02-21 11:36:06,798 MessageDeliveryTask.java (line 61) doVerb:REQUEST_RESPONSE
DEBUG [Thrift:22] 2014-02-21 11:36:06,798 Tracing.java (line 157) request complete
日志2:
请求被发送到90机器,要求插入first_name为eran10004的数据,可以看到时间戳为1392953753016000 ,取到了first_name仍然为eran10003
DEBUG [Thrift:16] 2014-02-21 11:35:53,016 org.apache.cassandra.thrift.CassandraServer (line 1916) execute_cql3_query:[INSERT INTO employees (empID, deptID, first_name, last_name) VALUES (111, 222, 'eran10004', 'landau');] with clever:QUORUM TRACE [Thrift:16] 2014-02-21 11:35:53,016 org.apache.cassandra.cql3.QueryProcessor (line 97) Process [email protected] @CL.QUORUM INFO [Thrift:16] 2014-02-21 11:35:53,016 org.apache.cassandra.service.StorageProxy (line 801) destination:/10.130.24.90 INFO [Thrift:16] 2014-02-21 11:35:53,016 org.apache.cassandra.service.StorageProxy (line 817) insertLocal INFO [Thrift:16] 2014-02-21 11:35:53,017 org.apache.cassandra.service.StorageProxy (line 801) destination:/10.130.24.143 INFO [Thrift:16] 2014-02-21 11:35:53,017 org.apache.cassandra.service.StorageProxy (line 822) insert to different server INFO [Thrift:16] 2014-02-21 11:35:53,017 org.apache.cassandra.service.StorageProxy (line 831) direct writes to local DC TRACE [Thrift:16] 2014-02-21 11:35:53,017 org.apache.cassandra.net.MessagingService (line 615) /10.130.24.90 sending MUTATION to [email protected]/10.130.24.143 INFO [Thrift:16] 2014-02-21 11:35:53,017 org.apache.cassandra.service.StorageProxy (line 801) destination:/10.130.24.91 INFO [Thrift:16] 2014-02-21 11:35:53,018 org.apache.cassandra.service.StorageProxy (line 822) insert to different server INFO [Thrift:16] 2014-02-21 11:35:53,018 org.apache.cassandra.service.StorageProxy (line 831) direct writes to local DC TRACE [Thrift:16] 2014-02-21 11:35:53,018 org.apache.cassandra.net.MessagingService (line 615) /10.130.24.90 sending MUTATION to [email protected]/10.130.24.91 DEBUG [MutationStage:117] 2014-02-21 11:35:53,017 org.apache.cassandra.db.Keyspace (line 351) RowMutation:RowMutation(keyspace='contactks', key='0000006f', modifications=[java.nio.HeapByteBuffer[pos=0 lim=10 cap=12]: timestamp:1392953753016000,java.nio.HeapByteBuffer[pos=0 lim=20 cap=38]:eran10004 timestamp:1392953753016000,java.nio.HeapByteBuffer[pos=0 lim=19 cap=36]:landau timestamp:1392953753016000,]) DEBUG [MutationStage:117] 2014-02-21 11:35:53,018 org.apache.cassandra.db.Keyspace (line 359) Appending to commitlog DEBUG [MutationStage:117] 2014-02-21 11:35:53,019 org.apache.cassandra.db.Keyspace (line 375) Adding to employees memtable DEBUG [Thread-8] 2014-02-21 11:35:53,020 org.apache.cassandra.net.MessagingService (line 697) Message received from /10.130.24.91, id :28705 DEBUG [Thread-10] 2014-02-21 11:35:53,020 org.apache.cassandra.net.MessagingService (line 697) Message received from /10.130.24.143, id :28704 DEBUG [RequestResponseStage:82] 2014-02-21 11:35:53,021 org.apache.cassandra.net.MessageDeliveryTask (line 61) doVerb:REQUEST_RESPONSE DEBUG [RequestResponseStage:83] 2014-02-21 11:35:53,021 org.apache.cassandra.net.MessageDeliveryTask (line 61) doVerb:REQUEST_RESPONSE DEBUG [Thrift:16] 2014-02-21 11:35:53,021 org.apache.cassandra.tracing.Tracing (line 157) request complete
观察上述结果,发现新插入的数据:eran10004,没有成功获取,但是请求确实已经发送到了服务器,并且已经被执行。究竟是什么原因导致的呢?熟悉Cassandra数据存储特点的一定都知道,Cassandra的每一列都是分版本存储的,即插入一列新数据,Cassandra都要赋予其一个时间戳作为版本号,如果主键一致,那原来的数据并不删除,而是直接把新数据覆盖在旧数据上,并按时间戳排序;每次查询都会获取时间戳最大的那一个。
结合日志发现eran10003的时间戳为1392953766795000,大于eran10004的时间戳:1392953753016000 。于是查询的时候,Cassandra就认为eran10003是最新的,而eran10004也就获取不到了。
时间戳是服务器本地时间,91的时间大于90的时间所以出现了问题。
解决方案:
1、使用Ntp同步91和90及143的时间。
2、时间戳有客户端控制而不在服务端,如果是多客户端,客户端之间还要同步时间。
当然还需要使用Consistency Level: QUORUM。