标签:
| orc | orc (split 110M) | parquet +snappy | parquet +gzip | |
| spark-sql 1.4 | 2mins, 7sec | 1mins,40sec | Parquet does not support decimal | Parquet does not support decimal |
| spark-sql 1.6 | 1mins, 30sec | 大概1mins,4sec | 大概1mins,4sec | 大概1mins,4sec |
| hive | 20mins | 18.5mins | 大概20mins | 大概20mins |
| 所占空间(raw倍数) | 1 | 1 | 1.6 | 1 |
spark-sql 1.6保持分配600G的内存不变,在不同数据量下进行测试:
|
|
200G
|
550G
|
1.1T
|
|---|---|---|---|
| spark-sql 1.4 | 11-12mins | ||
| spark-sql 1.6 | 7-8mins | 22mins | 51mins |
| hive | 15mins | 50mins | 将近5T内存,就没测试 |
3) 听单
|
|
time
|
|---|---|
| spark-sql 1.6 | 190s |
| hive | 1117s |
4)
三,总结
标签:
原文地址:http://www.cnblogs.com/shoudi/p/5564119.html