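Notes on using the Genome Browser (NGB)
I will probably keep updating this, since I still don't understand what all these files are for (facepalm).
Installation steps:
-->1: Install docker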
sudo apt-get update
sudo apt-get install docker.io
docker --version
-->2: Get NGB.git
git clone https://github.com/epam/NGB.git
-->3: Build
cd NGB
./gradlew buildDocker
-->4: Confirm that docker now has an image named ngb:latest
docker image ls
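Scanning the image list by eye is easy to get wrong, so the check in this step can also be scripted. A minimal sketch, assuming a POSIX shell: `has_image` is a hypothetical helper name (not part of docker or NGB), while `docker image ls --format` is the standard docker CLI.

```shell
# has_image NAME: read "repository:tag" lines on stdin and succeed
# only if one of them is exactly NAME. (Hypothetical helper.)
has_image() {
  grep -qxF -- "$1"
}

# Only query docker when it is actually installed.
if command -v docker >/dev/null 2>&1; then
  if docker image ls --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | has_image ngb:latest; then
    echo "ngb:latest is present"
  else
    echo "ngb:latest not found; check the output of ./gradlew buildDocker"
  fi
fi
```

Matching the whole line avoids a false positive from a similarly named tag such as a hypothetical ngb:latest-old.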
-->5: Run NGB from the existing ngb:latest image
docker run -p 8080:8080 -d --name ngbcore -v /home/2tong:/ngs ngb:latest
Tips: ngbcore (the container name) and /home/2tong (the host directory mounted as /ngs) can both be changed freely.
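Since the container name and the host directory are the only parts of the run command that change, they can be treated as parameters. A minimal sketch: `ngb_run_cmd` is a hypothetical helper that only prints the command from step 5 with those two values substituted, and `mybrowser` and `/data/ngs` are made-up example values.

```shell
# ngb_run_cmd [NAME] [HOST_DIR]: print the docker run command from step 5
# with the container name and host directory filled in.
# The defaults are the values used in this note. Values containing
# spaces are not quoted in the printed command.
ngb_run_cmd() {
  name="${1:-ngbcore}"
  host_dir="${2:-/home/2tong}"
  printf 'docker run -p 8080:8080 -d --name %s -v %s:/ngs ngb:latest\n' "$name" "$host_dir"
}

# Example with a different name and directory (both hypothetical).
ngb_run_cmd mybrowser /data/ngs
```

Whatever host directory is chosen, its contents appear under /ngs inside the container, which is why the paths used when registering data start with /ngs.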
-->6: Open a browser and go to http://localhost:8080/catgenome
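The server may need a moment after the container starts before the page loads. A rough sketch of polling the address until it answers, assuming `curl` is installed: `ngb_url` and `wait_for_ngb` are hypothetical helper names, and the retry count and two-second pause are arbitrary choices.

```shell
# ngb_url [HOST] [PORT]: print the NGB address for a host and port.
ngb_url() {
  printf 'http://%s:%s/catgenome\n' "${1:-localhost}" "${2:-8080}"
}

# wait_for_ngb URL [TRIES]: poll URL until curl gets a non-error
# response, or give up after TRIES attempts (default 30).
wait_for_ngb() {
  url="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  return 1
}

# Print the address used in this note.
ngb_url
# To block until the server answers:
#   wait_for_ngb "$(ngb_url)" && echo "NGB is up"
```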
-->7: Register data with ngb
docker exec -it ngbcore /bin/bash
ngb reg_ref /ngs/<Path to FASTA> -n my_genome -t
ngb reg_file my_genome /ngs/<Path to File> -n my_file1 -t
ngb reg_dataset my_genome my_sample_dataset my_file1
exit
Adding my_file2, my_file3, and so on increases the number of files the dataset contains.
How long registration takes depends on the size of the data.
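With more than a couple of files, typing one reg_file line per file gets tedious. A minimal sketch that generates the commands instead: `ngb_register_cmds` is a hypothetical helper, the my_file1, my_file2, ... names follow the convention above, and the `/ngs/a.bam` and `/ngs/b.vcf` paths are made up. It only prints the commands; they would still be run inside the container.

```shell
# ngb_register_cmds REF DATASET PATH...: print one reg_file command per
# path (named my_file1, my_file2, ...) followed by a single reg_dataset
# command that lists all of those names. Paths containing spaces are
# not quoted in the printed commands.
ngb_register_cmds() {
  ref="$1"
  dataset="$2"
  shift 2
  names=""
  i=1
  for path in "$@"; do
    name="my_file$i"
    printf 'ngb reg_file %s %s -n %s -t\n' "$ref" "$path" "$name"
    names="$names $name"
    i=$((i + 1))
  done
  printf 'ngb reg_dataset %s %s%s\n' "$ref" "$dataset" "$names"
}

# Example with two hypothetical files under the mounted /ngs directory.
ngb_register_cmds my_genome my_sample_dataset /ngs/a.bam /ngs/b.vcf
```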
-->8: Select the files and start comparing
Supplement:
-->1: Commonly used ngb commands
CLI for NGB server
All objects can be addressed by biologicalDataItemID or by name.
REFERENCE commands:
rr reg_ref : registers a reference file {rr \path\to\file.fa -n grch38}
dr del_ref : unregisters a reference file {dr grch38}
lr list_ref : lists all reference files, registered on the server {lr}
ag add_genes : adds a gene file to the reference {ag grch38 genes.gtf}
an add_ann : adds a annotation file to the reference {an grch38 annotations.gtf}
ran remove_ann : remove a annotation file from the list of reference annotation files {ran grch38 annotations.gtf}
rg remove_genes : removes a gene file from the reference {rg grch38}
FILE commands:
rf reg_file : registers a feature file for a specified reference {rf grch38 \path\to\file.bam?\path\to\file.bam.bai -n my_vcf}
df del_file : deletes a feature file one {df my_vcf}
if index_file : creates a feature index for a file. {if genes.gtf}
SEARCH commands:
s search : finds a reference or feature file by it's name, search can be configured by a '-c' option {s -l vcf}
DATASET commands:
rd reg_dataset : creates a new dataset (ex project) for a specified reference {rd grch38 my_dataset}
add add_dataset : adds files to a dataset {add my_dataset sample.bam sample.vcf}
rmd remove_dataset : removes files from a dataset {rmd my_dataset my_vcf}
dd del_dataset : removes a dataset {dd my_dataset}
md move_dataset : changes the dataset parent to the dataset specified by the "-p" option, if option isn't provided, the dataset will be moved to the top level of the datasets hierarchy {md my_dataset -p parent}
ld list_dataset : lists all datasets, registered on the server {ld}
ADDITIONAL commands:
url : generate url for displaying required files. {url my_dataset}
TOOLS commands:
sort : sorts given feature file. If target path is not specified, sorted file will be stored in the same folder as the original one with the `.sorted.` suffix in the name.
CONFIGURATION commands:
srv set_srv : sets working server url for CLI srv http://{SERVER_IP_OR_NAME}:{SERVER_PORT}/catgenome
v version : prints CLI version to the console standard output
Available options (options may go before, after or between the arguments):
-c (--config, --configuration) PATH : path to the configuration file
-f (--force) : defines if a dataset will be force deleted (default: false)
-g (--genes) VAL : specifies a gene file for reference registration
-h (--help) : prints help (default: true)
-j (--json) : output request's result in a json, otherwise the output of all commands will be ignored, excluding search and list commands (default: false)
-l (--like) : use non-strict search for file finding (default: false)
-loc (--location) VAL : location of view port in format: chr:start-end
-m (--max_memory) N : specifies amount of memory in megabytes to use when sorting (default: 500) (default: 0)
-n (--name) VAL : explicitly specifies file name for registration
-ngc (--nogccontent) : specifies if GC content shouldn't be calculated during reference registration (default: false)
-ni (--no_index) : defines if a feature index should not be created for registered VCF or GFF/GTF file (default: false)
-p (--parent) VAL : specifies dataset parent for registration
-pt (--pretty) VAL : pretty name for datasets or biological data file
-t (--table) : output request's result in a table, otherwise the output of all commands will be ignored, excluding search and list commands (default: false)
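-->2: Classification of indels (variant types)
Substitution: compared with the reference sequence, one base is replaced by another; written with the symbol ">"; e.g. c.125A>T means that the A at position 125 is replaced by T.
Deletion: compared with the reference sequence, one or more bases are missing; labeled "DEL"; e.g. c.2054delA means that the A at position 2054 is deleted.
Insertion: compared with the reference sequence, one or more bases are added; labeled "INS"; e.g. c.5750_5751insAGG means that the three bases AGG are inserted between positions 5750 and 5751.
Deletion-insertion: compared with the reference sequence, one or more bases are replaced by other bases, where the change is not a substitution, an inversion, or a conversion; usually written "delins" and labeled "MIXED" here; e.g. c.6776delinsGA means that the base at position 6776 is deleted and replaced by GA.
Duplication: compared with the reference sequence, a copy of one or more bases is inserted directly into the sequence; labeled "DUP"; e.g. c.6_8dupT means a duplication of T from position 6 to position 8.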
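The five categories above can be told apart from the notation alone, as long as "delins" is checked before "del" and "ins". A minimal sketch: `variant_type` is a hypothetical helper that only pattern-matches the description string; it does not validate positions or bases, and it is not how NGB itself assigns its labels.

```shell
# variant_type DESCRIPTION: print the category of a variant description
# such as c.125A>T. "delins" must be tested first because it contains
# both "del" and "ins". Returns 1 for anything unrecognized.
variant_type() {
  case "$1" in
    *delins*) echo "deletion-insertion" ;;  # labeled MIXED in this note
    *del*)    echo "deletion" ;;            # labeled DEL
    *ins*)    echo "insertion" ;;           # labeled INS
    *dup*)    echo "duplication" ;;         # labeled DUP
    *">"*)    echo "substitution" ;;        # written with ">"
    *)        echo "unknown"; return 1 ;;
  esac
}

# Classify the five examples from the list above.
for v in 'c.125A>T' 'c.2054delA' 'c.5750_5751insAGG' 'c.6776delinsGA' 'c.6_8dupT'; do
  printf '%s\t%s\n' "$v" "$(variant_type "$v")"
done
```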
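Addendum: rather silly of me, I need to brush up on docker.
docker exec -it ngbcore3 ngb reg_ref ..... works directly, with no need to open a shell in the container first.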
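The same one-line form can be wrapped so that any ngb subcommand runs from the host. A minimal sketch: `ngb_in` is a hypothetical helper, and the `DOCKER` variable is an assumption added here so the call can be dry-run with `echo`. The `-it` flags are dropped on purpose, because `docker exec -t` fails when no terminal is attached, as in a script.

```shell
# ngb_in CONTAINER ARGS...: run "ngb ARGS..." inside CONTAINER without
# opening an interactive shell first. Set DOCKER=echo to print the
# docker arguments instead of executing them (dry run).
ngb_in() {
  container="$1"
  shift
  "${DOCKER:-docker}" exec "$container" ngb "$@"
}

# Usage, with the container name from the line above:
#   ngb_in ngbcore3 lr -t      # list registered references as a table
#   ngb_in ngbcore3 ld -t      # list registered datasets as a table
```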