Gene Data Processing 114: Successfully Building a Whole-Genome Index with BWA

Run log

hadoop@Mcnode5:~/disk2/home/hadoop/xubo/ref/buildIndex$ bwa index GCA_000001405.15_GRCh38_full_analysis_set.fna 
[bwa_index] Pack FASTA... 33.14 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=6418915856, availableWord=463658232
[BWTIncConstructFromPacked] 10 iterations done. 100000000 characters processed.
[BWTIncConstructFromPacked] 20 iterations done. 200000000 characters processed.
[BWTIncConstructFromPacked] 30 iterations done. 300000000 characters processed.
[BWTIncConstructFromPacked] 40 iterations done. 400000000 characters processed.
[BWTIncConstructFromPacked] 50 iterations done. 500000000 characters processed.
[BWTIncConstructFromPacked] 60 iterations done. 600000000 characters processed.
[BWTIncConstructFromPacked] 70 iterations done. 700000000 characters processed.
[BWTIncConstructFromPacked] 80 iterations done. 800000000 characters processed.
[BWTIncConstructFromPacked] 90 iterations done. 900000000 characters processed.
[BWTIncConstructFromPacked] 100 iterations done. 1000000000 characters processed.
[BWTIncConstructFromPacked] 110 iterations done. 1100000000 characters processed.
[BWTIncConstructFromPacked] 120 iterations done. 1200000000 characters processed.
[BWTIncConstructFromPacked] 130 iterations done. 1300000000 characters processed.
[BWTIncConstructFromPacked] 140 iterations done. 1400000000 characters processed.
[BWTIncConstructFromPacked] 150 iterations done. 1500000000 characters processed.
[BWTIncConstructFromPacked] 160 iterations done. 1600000000 characters processed.
[BWTIncConstructFromPacked] 170 iterations done. 1700000000 characters processed.
[BWTIncConstructFromPacked] 180 iterations done. 1800000000 characters processed.
[BWTIncConstructFromPacked] 190 iterations done. 1900000000 characters processed.
[BWTIncConstructFromPacked] 200 iterations done. 2000000000 characters processed.
[BWTIncConstructFromPacked] 210 iterations done. 2100000000 characters processed.
[BWTIncConstructFromPacked] 220 iterations done. 2200000000 characters processed.
[BWTIncConstructFromPacked] 230 iterations done. 2300000000 characters processed.
[BWTIncConstructFromPacked] 240 iterations done. 2400000000 characters processed.
[BWTIncConstructFromPacked] 250 iterations done. 2500000000 characters processed.
[BWTIncConstructFromPacked] 260 iterations done. 2600000000 characters processed.
[BWTIncConstructFromPacked] 270 iterations done. 2700000000 characters processed.
[BWTIncConstructFromPacked] 280 iterations done. 2800000000 characters processed.
[BWTIncConstructFromPacked] 290 iterations done. 2900000000 characters processed.
[BWTIncConstructFromPacked] 300 iterations done. 3000000000 characters processed.
[BWTIncConstructFromPacked] 310 iterations done. 3100000000 characters processed.
[BWTIncConstructFromPacked] 320 iterations done. 3200000000 characters processed.
[BWTIncConstructFromPacked] 330 iterations done. 3300000000 characters processed.
[BWTIncConstructFromPacked] 340 iterations done. 3400000000 characters processed.
[BWTIncConstructFromPacked] 350 iterations done. 3500000000 characters processed.
[BWTIncConstructFromPacked] 360 iterations done. 3600000000 characters processed.
[BWTIncConstructFromPacked] 370 iterations done. 3700000000 characters processed.
[BWTIncConstructFromPacked] 380 iterations done. 3800000000 characters processed.
[BWTIncConstructFromPacked] 390 iterations done. 3900000000 characters processed.
[BWTIncConstructFromPacked] 400 iterations done. 4000000000 characters processed.
[BWTIncConstructFromPacked] 410 iterations done. 4100000000 characters processed.
[BWTIncConstructFromPacked] 420 iterations done. 4200000000 characters processed.
[BWTIncConstructFromPacked] 430 iterations done. 4300000000 characters processed.
[BWTIncConstructFromPacked] 440 iterations done. 4400000000 characters processed.
[BWTIncConstructFromPacked] 450 iterations done. 4500000000 characters processed.
[BWTIncConstructFromPacked] 460 iterations done. 4600000000 characters processed.
[BWTIncConstructFromPacked] 470 iterations done. 4700000000 characters processed.
[BWTIncConstructFromPacked] 480 iterations done. 4800000000 characters processed.
[BWTIncConstructFromPacked] 490 iterations done. 4900000000 characters processed.
[BWTIncConstructFromPacked] 500 iterations done. 5000000000 characters processed.
[BWTIncConstructFromPacked] 510 iterations done. 5100000000 characters processed.
[BWTIncConstructFromPacked] 520 iterations done. 5200000000 characters processed.
[BWTIncConstructFromPacked] 530 iterations done. 5300000000 characters processed.
[BWTIncConstructFromPacked] 540 iterations done. 5400000000 characters processed.
[BWTIncConstructFromPacked] 550 iterations done. 5500000000 characters processed.
[BWTIncConstructFromPacked] 560 iterations done. 5600000000 characters processed.
[BWTIncConstructFromPacked] 570 iterations done. 5700000000 characters processed.
[BWTIncConstructFromPacked] 580 iterations done. 5798188880 characters processed.
[BWTIncConstructFromPacked] 590 iterations done. 5886472096 characters processed.
[BWTIncConstructFromPacked] 600 iterations done. 5964934432 characters processed.
[BWTIncConstructFromPacked] 610 iterations done. 6034667936 characters processed.
[BWTIncConstructFromPacked] 620 iterations done. 6096643264 characters processed.
[BWTIncConstructFromPacked] 630 iterations done. 6151723072 characters processed.
[BWTIncConstructFromPacked] 640 iterations done. 6200674128 characters processed.
[BWTIncConstructFromPacked] 650 iterations done. 6244177920 characters processed.
[BWTIncConstructFromPacked] 660 iterations done. 6282840176 characters processed.
[BWTIncConstructFromPacked] 670 iterations done. 6317199264 characters processed.
[BWTIncConstructFromPacked] 680 iterations done. 6347733664 characters processed.
[BWTIncConstructFromPacked] 690 iterations done. 6374868704 characters processed.
[BWTIncConstructFromPacked] 700 iterations done. 6398982368 characters processed.
[BWTIncConstructFromPacked] 710 iterations done. 6418915856 characters processed.
[bwt_gen] Finished constructing BWT in 710 iterations.
[bwa_index] 3649.78 seconds elapse.
[bwa_index] Update BWT... 23.62 sec
[bwa_index] Pack forward-only FASTA... 21.46 sec
[bwa_index] Construct SA from BWT and Occ... 1015.61 sec
[main] Version: 0.7.12-r1039
[main] CMD: bwa index GCA_000001405.15_GRCh38_full_analysis_set.fna
[main] Real time: 4891.025 sec; CPU: 4743.604 sec
hadoop@Mcnode5:~/disk2/home/hadoop/xubo/ref/buildIndex$ ls
GCA_000001405.15_GRCh38_full_analysis_set.fna      GCA_000001405.15_GRCh38_full_analysis_set.fna.ann  GCA_000001405.15_GRCh38_full_analysis_set.fna.pac
GCA_000001405.15_GRCh38_full_analysis_set.fna.amb  GCA_000001405.15_GRCh38_full_analysis_set.fna.bwt  GCA_000001405.15_GRCh38_full_analysis_set.fna.sa
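
To see where the time went, the per-stage timings in the log can be summed and compared with the totals that bwa reports on its last line. The following is a minimal sketch, not part of the original run: the function name stage_seconds and the label "Construct BWT" for the unnamed "seconds elapse" line are assumptions made for illustration, and the log lines are copied from the output above.

```python
import re

# Stage lines copied verbatim from the run log above.
LOG = """\
[bwa_index] Pack FASTA... 33.14 sec
[bwa_index] 3649.78 seconds elapse.
[bwa_index] Update BWT... 23.62 sec
[bwa_index] Pack forward-only FASTA... 21.46 sec
[bwa_index] Construct SA from BWT and Occ... 1015.61 sec
[main] Real time: 4891.025 sec; CPU: 4743.604 sec
"""

# Matches "[bwa_index] <name>... <secs> sec" and "[bwa_index] <secs> seconds elapse."
STAGE_RE = re.compile(
    r"^\[bwa_index\] (?:(.+?)\.\.\. )?([\d.]+) (?:sec|seconds elapse\.)$"
)


def stage_seconds(log):
    """Return a list of (stage name, seconds) pairs found in a bwa index log."""
    stages = []
    for line in log.splitlines():
        match = STAGE_RE.match(line.strip())
        if match:
            # The BWT construction line carries no stage name in the log;
            # the label used here is an assumption for readability.
            name = match.group(1) or "Construct BWT"
            stages.append((name, float(match.group(2))))
    return stages


if __name__ == "__main__":
    stages = stage_seconds(LOG)
    total = sum(seconds for _, seconds in stages)
    for name, seconds in stages:
        print(f"{name:<32} {seconds:>9.2f} s  {100 * seconds / total:5.1f} %")
    print(f"{'Sum of stages':<32} {total:>9.2f} s")
```

The five stages add up to 4743.61 s, which agrees with the reported CPU time of 4743.604 s to within rounding. BWT construction accounts for roughly three quarters of that, and suffix array construction for most of the rest.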

Memory usage:

16:39:13  memtot memfree buffers   cached  slabmem      swptot swpfree  _mem_
16:39:13  14023M    713M    231M    8247M     571M       6133M   6133M
16:39:14  14023M    711M    231M    8247M     571M       6133M   6133M
16:39:15  14023M    711M    231M    8247M     571M       6133M   6133M
16:39:16  14023M    666M    231M    8247M     571M       6133M   6133M
16:39:17  14023M    372M    231M    8247M     571M       6133M   6133M
***
17:31:09  14023M    358M     89M    5002M     424M       6133M   6089M
17:31:10  14023M    173M     89M    5182M     428M       6133M   6089M
17:31:11  14023M    171M     89M    5186M     425M       6133M   6089M
17:31:12  14023M    154M     89M    5205M     424M       6133M   6089M
17:31:13  14023M    154M     89M    5203M     425M       6133M   6089M
17:31:14  14023M    154M     89M    5204M     425M       6133M   6089M
17:31:15  14023M    154M     89M    5204M     425M       6133M   6089M
17:31:16  14023M    154M     89M    5204M     425M       6133M   6089M
17:31:17  14023M    170M     89M    5188M     425M       6133M   6089M
17:31:18  14023M    154M     89M    5204M     425M       6133M   6089M
17:31:19  14023M    154M     89M    5204M     425M       6133M   6089M
17:31:20  14023M    155M     89M    5204M     425M       6133M   6089M
17:31:21  14023M    176M     89M    5182M     425M       6133M   6089M
17:31:22  14023M    172M     89M    5182M     425M       6133M   6089M
17:31:23  14023M    172M     89M    5182M     425M       6133M   6089M
17:31:24  14023M    172M     89M    5182M     424M       6133M   6089M
17:31:25  14023M   1081M     89M    5182M     424M       6133M   6089M
17:31:26  14023M   4776M     89M    5182M     424M       6133M   6089M
17:31:27  14023M   4767M     89M    5182M     424M       6133M   6089M
17:31:28  14023M   4768M     89M    5182M     424M       6133M   6089M
17:31:29  14023M   4768M     89M    5182M     424M       6133M   6089M
17:31:30  14023M   4768M     89M    5182M     424M       6133M   6089M
17:31:31  14023M   4768M     89M    5182M     424M       6133M   6089M
17:31:32  14023M   4768M     89M    5182M     424M       6133M   6089M
17:31:33  14023M   4768M     89M    5182M     424M       6133M   6089M
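
The point where free memory recovers can be located automatically rather than by eye. The sketch below parses a few rows copied from the table above and reports the largest one-second increase in the memfree column. The helper names parse_rows and largest_jump are hypothetical, the "M" suffix is assumed to mean megabytes, and the log itself does not state what caused the jump.

```python
# Rows copied from the memory log above: time, memtot, memfree, buffers, ...
SAMPLES = """\
17:31:23  14023M    172M     89M    5182M     425M       6133M   6089M
17:31:24  14023M    172M     89M    5182M     424M       6133M   6089M
17:31:25  14023M   1081M     89M    5182M     424M       6133M   6089M
17:31:26  14023M   4776M     89M    5182M     424M       6133M   6089M
17:31:27  14023M   4767M     89M    5182M     424M       6133M   6089M
"""


def parse_rows(text):
    """Return (timestamp, memfree) pairs; memfree is the third column."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the '***' elision marker, and blank lines.
        if len(fields) < 3 or not fields[2].rstrip("M").isdigit():
            continue
        rows.append((fields[0], int(fields[2].rstrip("M"))))
    return rows


def largest_jump(rows):
    """Return (from_time, to_time, delta) for the biggest memfree increase."""
    best = None
    for (t0, free0), (t1, free1) in zip(rows, rows[1:]):
        delta = free1 - free0
        if best is None or delta > best[2]:
            best = (t0, t1, delta)
    return best


if __name__ == "__main__":
    start, end, delta = largest_jump(parse_rows(SAMPLES))
    print(f"largest memfree increase: {delta}M between {start} and {end}")
```

On these rows the largest single step is 3695M between 17:31:25 and 17:31:26. Over the two seconds from 17:31:24 to 17:31:26, memfree rises from 172M to 4776M in total.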

Publications:

【1】 [BIBM] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Chao Wang, Xuehai Zhou. Distributed Gene Clinical Decision Support System Based on Cloud Computing. IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2017, CCF-B).
【2】 [IEEE CLOUD] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Xuehai Zhou. Efficient Distributed Smith-Waterman Algorithm Based on Apache Spark (CLOUD 2017, CCF-C).
【3】 [CCGrid] Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Jinhong Zhou, Xuehai Zhou. DSA: Scalable Distributed Sequence Alignment System Using SIMD Instructions (CCGrid 2017, CCF-C).
【4】 More: https://github.com/xubo245/Publications

Help

If you have any questions or suggestions, please open an issue in this project or e-mail me at xubo245@mail.ustc.edu.cn
WeChat: xu601450868
QQ: 601450868