There is a very old table that stores its timestamp as a varchar. The schema:

id bigint
name varchar(200)
create_time varchar(20)
// index
KEY `IDX_CREATED` (`create_time`),

The table holds a bit over 5 million rows. Now for the problem I ran into: the following SQL is extremely slow:

select name, create_time from t where create_time > 1434115807296 order by create_time limit 1000;

It takes about 200 s on my machine. The execution plan:

> explain select name, create_time from t where create_time > 1434115807296 order by create_time limit 1000;

+----+-------------+-------+-------+---------------+---------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key           | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+-------------+
|  1 | SIMPLE      | User  | index |  IDX_CREATED  |  IDX_CREATED  |   63    | NULL | 1000 | Using where |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+-------------+
1 row in set (0.00 sec)

If we drop the limit:

select name, create_time from t where create_time > 1434115807296 order by create_time

it finishes in about 5 s. The execution plan:

> explain select name, create_time from t where create_time > 1434115807296 order by create_time

+----+-------------+-------+------+---------------+------+---------+------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra                       |
+----+-------------+-------+------+---------------+------+---------+------+---------+-----------------------------+
|  1 | SIMPLE      | User  | ALL  |  IDX_CREATED  | NULL |   NULL  | NULL | 4858500 | Using where; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+---------+-----------------------------+
1 row in set (0.00 sec)    

How can an index scan be this much slower than a full scan plus filesort? Let's dig into the details with profiles:

mysql> show profiles;
+----------+--------------+------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration     | Query                                                                                                      |
+----------+--------------+------------------------------------------------------------------------------------------------------------+
|        1 |   0.00179100 | explain select name, created_at from user where created_at > 1434115807296 order by created_at limit 1000  |
|        2 | 195.10171300 | select name, created_at from user where created_at > 1434115807296 order by created_at limit 1000          |
|        3 |   5.28713500 | select name, created_at from user where created_at > 1434115807296 order by created_at                     |
+----------+--------------+------------------------------------------------------------------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)

mysql> show profile for query 2;
+----------------------+------------+
| Status               | Duration   |
+----------------------+------------+
| starting             |   0.000050 |
| checking permissions |   0.000006 |
| Opening tables       |   0.000013 |
| init                 |   0.000017 |
| System lock          |   0.000005 |
| optimizing           |   0.000006 |
| statistics           |   0.000018 |
| preparing            |   0.000011 |
| Sorting result       |   0.000002 |
| executing            |   0.000002 |
| Sending data         | 195.101362 |
| end                  |   0.000011 |
| query end            |   0.000009 |
| closing tables       |   0.000012 |
| freeing items        |   0.000158 |
| logging slow query   |   0.000006 |
| cleaning up          |   0.000025 |
+----------------------+------------+
17 rows in set, 1 warning (0.00 sec)    

mysql> show profile for query 3;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000063 |
| checking permissions | 0.000007 |
| Opening tables       | 0.000029 |
| init                 | 0.000024 |
| System lock          | 0.000730 |
| optimizing           | 0.000014 |
| statistics           | 0.000021 |
| preparing            | 0.000010 |
| Sorting result       | 0.000003 |
| executing            | 0.000002 |
| Sending data         | 0.000006 |
| Creating sort index  | 5.286166 |
| end                  | 0.000012 |
| query end            | 0.000006 |
| closing tables       | 0.000011 |
| freeing items        | 0.000016 |
| cleaning up          | 0.000015 |
+----------------------+----------+
17 rows in set, 1 warning (0.00 sec)

Almost all of query 2's time is spent in Sending data, while query 3's time goes to Creating sort index, i.e. the filesort. So what on earth is Sending data?

Sending data

The thread is reading and processing rows for a SELECT statement, and sending data to the client. Because operations occurring during this state tend to perform large amounts of disk access (reads), it is often the longest-running state over the lifetime of a given query.

The official explanation is that this state covers both processing the query's rows and returning them to the client. Query 3 returns far more rows yet finishes in about 5 s, which rules out client transfer as the cause.
The underlying mistake was visible from the start: create_time is a varchar, but the SQL compares it against an integer, so the proper fix is to change the column to an INT/BIGINT.
Still, that alone doesn't explain why the query is slow, so I ran more tests. For example, selecting the primary key id instead of name drops the time to under 5 s:

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000061 |
| checking permissions | 0.000007 |
| Opening tables       | 0.000014 |
| init                 | 0.000019 |
| System lock          | 0.000006 |
| optimizing           | 0.000006 |
| statistics           | 0.000018 |
| preparing            | 0.000013 |
| Sorting result       | 0.000003 |
| executing            | 0.000002 |
| Sending data         | 4.531929 |
| end                  | 0.000009 |
| query end            | 0.000006 |
| closing tables       | 0.000010 |
| freeing items        | 0.000041 |
| cleaning up          | 0.000015 |
+----------------------+----------+
16 rows in set, 1 warning (0.00 sec)

Sending data drops dramatically. Is that because id is an indexed column? So I added an index on name as well...
and the query still took 200+ s. Strange. Could it be that name is simply too long? I added a sex char(2) column, built an index on it,
and the result was still exactly the same!

mysql> show profiles;
+----------+--------------+------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration     | Query                                                                                                      |
+----------+--------------+------------------------------------------------------------------------------------------------------------+
|        1 |   0.00179100 | explain  select name, created_at from user where created_at > 1434115807296 order by created_at limit 1000 |
|        2 | 195.10171300 | select name, created_at from user where created_at > 1434115807296 order by created_at limit 1000          |
|        3 |   5.28713500 | select name, created_at from user where created_at > 1434115807296 order by created_at                     |
|        4 |   4.53215900 | select id, created_at from user  where created_at > 1434115807296 order by created_at limit 1000           |
|        5 | 209.23518600 | select sex, created_at from user  where created_at > 1434115807296 order by created_at limit 1000          |
+----------+--------------+------------------------------------------------------------------------------------------------------------+
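A plausible explanation, which I haven't fully verified: because of the varchar-vs-integer comparison, MySQL must cast create_time for every entry, so the condition cannot be used as an index range. InnoDB secondary indexes store the primary key in their leaves, so the id variant is answered from IDX_CREATED alone, while the name and sex variants pay a random table lookup per index entry examined. If that holds, a composite index should make the original query covering as well. A sketch only, assuming the schema above (the index name is made up):

```sql
-- Hypothetical: extend the index so (create_time, name) is covering
ALTER TABLE t ADD KEY IDX_CREATED_NAME (create_time, name);

-- EXPLAIN should then report "Using index", i.e. no per-row table lookups
EXPLAIN SELECT name, create_time
FROM t
WHERE create_time > 1434115807296
ORDER BY create_time
LIMIT 1000;
```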

This question is still open!
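Even before migrating the column, one workaround seems worth trying: keep the comparison string-to-string so the optimizer can range-scan the varchar index without casting. A hedged sketch, untested against this table (note the string comparison only orders like numbers while all stored timestamps have the same digit length):

```sql
-- Compare against a string literal: no per-row cast, index range scan possible
SELECT name, create_time
FROM t
WHERE create_time > '1434115807296'
ORDER BY create_time
LIMIT 1000;

-- Long-term fix suggested above: migrate the column to a numeric type
ALTER TABLE t MODIFY create_time BIGINT NOT NULL;
```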

In practice you constantly run into Scala <=> Java conversions. For example, calling a method in some third-party library may require a java.util.Map, in which case Scala's JavaConverters or JavaConversions utilities come in handy.
Example:

// the Map to convert
scala> val m = Map(1->2)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2)

JavaConversions example:

import collection.JavaConversions._
scala> val conversionsMap = mapAsJavaMap(m).asInstanceOf[java.util.Map[Int, Int]]
conversionsMap: java.util.Map[Int,Int] = {1=2}

JavaConverters example:

import scala.collection.JavaConverters._
scala> val convertersMap = m.asJava
convertersMap: java.util.Map[Int,Int] = {1=2}

Both conversions succeed and the results look identical, so why do both utilities exist?

scala> conversionsMap.put(3,4)
java.lang.UnsupportedOperationException
  at java.util.AbstractMap.put(AbstractMap.java:203)
  ... 33 elided      

scala> convertersMap.put(3,4)
java.lang.UnsupportedOperationException
  at java.util.AbstractMap.put(AbstractMap.java:203)
  ... 33 elided

So the converted Maps are unmodifiable, which doesn't fit my needs. My guess is that this is because m is immutable, so let's keep going:

// convert to a mutable Map first
val m2 = collection.mutable.Map(m.toSeq: _*)
scala> m2.+(3->4)
res4: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)
// good, m2 is mutable now (note `+` returns a new map; `+=` mutates in place)

scala> val conversionsMap = mapAsJavaMap(m2).asInstanceOf[java.util.Map[Int, Int]]
conversionsMap: java.util.Map[Int,Int] = {1=2}
scala> conversionsMap.put(3,4)
java.lang.UnsupportedOperationException
  at java.util.AbstractMap.put(AbstractMap.java:203)
  ... 33 elided
// conversionsMap still can't be modified, ugh

scala> val convertersMap = m2.asJava
convertersMap: java.util.Map[Int,Int] = {1=2}
scala> convertersMap.put(3,4)
res5: Int = 0
scala> convertersMap
res6: java.util.Map[Int,Int] = {1=2, 3=4}
// convertersMap can be modified

So how do you get a modifiable Map out of JavaConversions? A quick search gives the answer: JavaConversions provides implicit conversions between Scala and Java collections, so to obtain a mutable Map from it you just ascribe the Java type and let the implicit conversion fire:

// JavaConversions converts implicitly
val conversionsMap2: java.util.Map[Int, Int] = m2
scala> conversionsMap2.put(3,4)
res7: Int = 0
scala> conversionsMap2
res8: java.util.Map[Int,Int] = {1=2, 3=4}

JavaConversions appeared in Scala 2.8 and JavaConverters in 2.8.1. JavaConverters is the newer of the two, and its explicit conversions are easier to follow, so it is the recommended choice.

Novels

《地球大炮》
《殷商玛雅征服史》
《时间之墟》
《王小波全集》(1-3)

Computing

《高性能Linux服务器构建实战》
《快学Scala》

History

《费马大定理》
《战争就是这么回事》(1-3)

Other

《集装箱改变世界》
《时间的形状》
《牌局心理学》
《男人除了性还在想什么?》


Clearly I read fewer books in 2014; too many things compete for attention, so it's time for some reflection again.
Since I entered the stock market in 2014, I also read some investment books. My conclusion: they're all hot air. They boil down to one sentence: investing is risky, enter the market with caution. Most of them just preach endless variations of not putting all your eggs in one basket; the less you read of these, the better.

《集装箱改变世界》 is the interesting one. The title suggests nothing at all, yet the book turned out to be fascinating; it feels like watching the machinery behind how the world changes and picking up a few of its laws, for example: the technology itself often matters less than how it gets applied. It's my single book recommendation of the year.

With the ways of getting information multiplying, I moved from paper books to e-books, and now even e-book reading has shrunk. All sorts of short Weibo and WeChat pieces get collected into books, so the value of a book seems to keep falling; it almost feels like you don't need books at all. WeChat public accounts, Weibo and other social platforms take up a big slice of my time, yet skipping them means missing a lot of real-time information and feeling out of touch. Learning how to extract useful information from all of this quickly matters more and more.

The new year needs proper planning. Happy New Year!