BGI-shenzhen / PopLDdecay

Licence: MIT License
PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format (VCF) files

Programming Languages: C++, shell, perl, Makefile, c, M4

PopLDdecay

PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files

The PopLDdecay article has been published in the journal Bioinformatics; please cite it if possible.

PMID: 30321304    DOI: 10.1093/bioinformatics/bty875

1) Install


Download


Method 1: For Linux/Unix and macOS
        git clone https://github.com/BGI-shenzhen/PopLDdecay.git 
        cd PopLDdecay; chmod 755 configure; ./configure;
        make;
        mv PopLDdecay  bin/;    #     [rm *.o]

Note: If linking fails, try re-installing the zlib library.

Method 2: For Linux/Unix and macOS

        tar -zxvf  PopLDdecayXXX.tar.gz
        cd PopLDdecayXXX;
        cd src;
        make ; make clean                            # or [sh make.sh]
        ../bin/PopLDdecay

Note: If linking fails, try re-installing the zlib library.
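
If the link step fails, one quick way to see whether the zlib header and library are visible to the compiler is to build a tiny test program against `-lz`. The sketch below is only an illustration: `check_zlib` is a hypothetical helper that is not part of PopLDdecay, and it assumes a C compiler reachable as `cc` (or through `$CC`).

```shell
#!/bin/sh
# Sketch: check that a program using zlib can be compiled and linked.
# check_zlib is a hypothetical helper name, not part of PopLDdecay.
check_zlib() {
    cc=${CC:-cc}                       # assumed compiler command
    tmp=$(mktemp -d) || return 2       # scratch directory for the test build
    printf '#include <zlib.h>\nint main(void) { return zlibVersion() == 0; }\n' > "$tmp/t.c"
    if "$cc" "$tmp/t.c" -lz -o "$tmp/t" 2>/dev/null; then
        echo "zlib: ok"
        status=0
    else
        echo "zlib: missing"           # header or library not found (or no compiler)
        status=1
    fi
    rm -rf "$tmp"
    return $status
}
```

Calling `check_zlib` before `make` prints `zlib: ok` when the test program links, and `zlib: missing` otherwise.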

2) Example


See the Documentation for more detailed usage.

    1. Calculate LD decay
      # 1)  For a GATK VCF file, run PopLDdecay directly
            ./bin/PopLDdecay    -InVCF  SNP.vcf.gz  -OutStat LDdecay
      # 2)  For PLINK [.ped .map] files, first convert them to genotype format, then run PopLDdecay
            perl bin/mis/plink2genotype.pl    -inPED in.ped -inMAP in.map  -outGenotype out.genotype
            ./bin/PopLDdecay        -InGenotype out.genotype -OutStat LDdecay
      # 3)  To calculate LD decay for the subgroup GroupA in a VCF file, put the GroupA sample names into GroupA_sample.list
            ./bin/PopLDdecay   -InVCF  <in.vcf.gz>  -OutStat <out.stat>  -SubPop    GroupA_sample.list
    2. Draw the figure
        #    2.1  For one population
        perl  bin/Plot_OnePop.pl  -inFile   LDdecay.stat.gz  -output  Fig
        #    2.2  For one population with multiple chromosomes     # List format: [chrResultPathWay]
        perl  bin/Plot_OnePop.pl  -inList   Chr.ResultPath.List  -output Fig
        #    2.3  For multiple populations                          # List format: [Pop.ResultPath  PopID]
        perl  bin/Plot_MutiPop.pl  -inList  Pop.ResultPath.list  -output Fig
    3. See the results [LDdecay.stat.gz] and [Fig.png Fig.pdf]
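
The examples above rely on three small helper files: the sample list given to `-SubPop` and the two list files given to `-inList`. The sketch below shows one way they might be generated. The function names are hypothetical, and several details are assumptions rather than documented behaviour: one sample name per line in the sample list, result files named `<PopID>.stat.gz` in a single directory, and a single space between the two columns of the population list.

```shell
#!/bin/sh
# Sketch: generate the list files used in the examples above.
# All function names here are hypothetical helpers, not part of PopLDdecay.

# Print the sample names of a gzipped VCF, one per line.
# In VCF, the "#CHROM" header line carries sample names from column 10 onward.
vcf_samples() {
    gzip -dc "$1" | awk -F '\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# One result path per line, for: perl bin/Plot_OnePop.pl -inList ...
make_chr_list() {
    dir=$1
    for f in "$dir"/*.stat.gz; do
        [ -e "$f" ] || continue        # skip the literal pattern when nothing matches
        printf '%s\n' "$f"
    done
}

# "ResultPath PopID" per line, for: perl bin/Plot_MutiPop.pl -inList ...
# Assumption: PopID is the file name without ".stat.gz" (GroupA.stat.gz -> GroupA).
make_pop_list() {
    dir=$1
    for f in "$dir"/*.stat.gz; do
        [ -e "$f" ] || continue
        id=$(basename "$f" .stat.gz)
        printf '%s %s\n' "$f" "$id"
    done
}
```

For example, `vcf_samples SNP.vcf.gz > all_samples.list` gives a full sample list that can be trimmed by hand to produce `GroupA_sample.list`, and `make_pop_list results > Pop.ResultPath.list` builds the input for the multi-population plot (`results` being a hypothetical directory name).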

3) Introduction


Linkage disequilibrium (LD) decay[1] is the most important and most common analysis in population resequencing[2]. Especially in self-pollinated crops, LD decay can reveal much not only about domestication and breeding history[3], but also about gene flow and regions under selection[1]. However, measuring LD decay with currently existing software and tools takes too much time and too many resources. LD decay studies also generate extraordinarily large amounts of temporary data when using the mainstream software "Haploview"[4], the classical LD processing tool. Obtaining and analysing LD decay results effectively remains a difficult task for individual researchers. Here, we introduce PopLDdecay, a simple and efficient software for LD decay analysis, which processes Variant Call Format (VCF)[5] files to produce LD decay statistics and plot LD decay graphs. PopLDdecay is designed to use compressed data files as input and output to save storage space, and it is faster and more computationally efficient than currently existing software. This software makes the LD decay pipeline significantly

  • Parameter description
	Usage: PopLDdecay -InVCF  <in.vcf.gz>  -OutStat <out.stat>

		-InVCF       <str>    Input SNP VCF Format
		-InGenotype  <str>    Input SNP Genotype Format
		-OutStat     <str>    OutPut Stat Dist ~ r^2 File

		-SubPop      <str>    SubGroup SampleList of VCFFile [ALLsample]
		-MaxDist     <int>    Max Distance (kb) between two SNP [300]
		-MAF         <float>  Min minor allele frequency filter [0.005]
		-Het         <float>  Max ratio of het allele filter [0.88]
		-Miss        <float>  Max ratio of miss allele filter [0.25]
		-EHH         <str>    To Run EHH Region decay set StartSite [NA]
		-OutFilterSNP         OutPut the final SNP to calculate
		-OutType     <int>    1: R^2 result 2: R^2 & D' result 3:PairWise LD Out[1]
		                      See the Help for more OutType [1-8] details
		
		-help                 Show more help [hewm2008 v3.41]
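
The values in square brackets above are the defaults. Writing them out explicitly can make a run easier to reproduce and to adjust. The wrapper below is a minimal sketch of that idea: `run_ld`, `POPLDDECAY` and `DRY_RUN` are hypothetical names introduced here for illustration, while the flags and default values are the ones listed in the parameter description.

```shell
#!/bin/sh
# Sketch: call PopLDdecay with the documented defaults written out.
# run_ld, POPLDDECAY and DRY_RUN are hypothetical names, not part of the tool.
POPLDDECAY=${POPLDDECAY:-./bin/PopLDdecay}

run_ld() {
    vcf=$1
    out=$2
    maxdist=${3:-300}      # -MaxDist default, in kb
    maf=${4:-0.005}        # -MAF default
    # Build the full command line as the positional parameters.
    set -- "$POPLDDECAY" -InVCF "$vcf" -OutStat "$out" \
        -MaxDist "$maxdist" -MAF "$maf" -Het 0.88 -Miss 0.25 -OutType 1
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"          # print the command instead of running it
    else
        "$@"
    fi
}
```

With `DRY_RUN=1`, `run_ld SNP.vcf.gz LDdecay 500 0.05` only prints the command line, here with a 500 kb maximum distance and a stricter MAF filter, so it can be checked before the real run.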

4) Results


Some LD decay figures that I drew for the paper.

5) Discussion


######################swimming in the sky and flying in the sea #############################
