Monthly Archives: December 2019

Animated Bar Chart Race For Clinical Trial Data

Github :

What is a Bar Chart Race:

The bar chart race is an animated chart and it displays top “n” values as per year or any category.

The chart consists of four parts. From bottom to top in z-order: the bars, the x-axis, the labels, and the ticker showing the current date. 

It has below features. 

  • Make the animation faster or slower by adjusting the duration in milliseconds
  • Selects top “n” values for displaying bar chart race
  • Good visualization with different colors for each value
  • Replay button

Reference Link:

Clinical Trial Registry Data :

The Clinical trial data file (clinical trial data.csv) is taken from “”. is a database of privately and publicly funded clinical studies conducted around the world. 

It is a Web-based resource that provides patients, their family members, health care professionals, researchers, and the public with easy access to information on publicly and privately supported clinical studies on a wide range of diseases and conditions. 

The Web site is maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH).

Steps for implementation of this graph:

  1. Open the link and click on “three dots icon (…)”  (displayed next to “Teams” at top right). Then, click on “Download code ”. The file “bar-chart-race-explained.tgz” will be downloaded. Now, extract the tgz file. It should contain .html , .js , .css , .json files as shown below:
  2. The file e9e3929cf7c50b45@3007.js  contains javascript code as shown in below image :

Open the e9e3929cf7c50b45@3007.js file using any editor (say VS code). Replace the file name “category-brands.csv” with our data file (CSV file) name.                   

3. Download our data file (CSV format).

4. If you want to add some extra features to bar chart(selecting top n values,adjusting duration time in text box etc.) then you should include extra code in the file e9e3929cf7c50b45@3007.js .

 5. Finally, open a terminal and run the below command to get into our project folder path. Ex: cd enter your project folder path(/home/desktop/bar chart race explained)

6. Run the command “simplehttpserver .” Then local server will start work like below:

Listening web root dir /home/Desktop/Bar Graph/bar-chart-race-explained

7. Open the browser and run the localhost:8000.Then animated bar chart will be generated as shown below

8. Duration time of animation can be adjusted

Connecting symptoms, diseases and genotype

In the previous article Connecting Clinical Trials to Research Articles , we have seen how to search PubMed database by specifying clinical trial id(s) and retrieve all the relevant journal articles. In this article, let’s learn about the association of symptoms and diseases, and Phenotype-Genotype.

Importance of symptom and disease relationship

Disease is an abnormal condition that negatively affects the functionality of an organism. Symptom is a physical or mental feature which can indicate a condition. The relation between the diseases and their symptoms are important to diagnose any disease. This information is also useful for medical research purposes. 

Each article in the PubMed is associated to metadata that includes major topics of the article. By using a perl script with the NCBI E-utilities, we can retrieve PubMed identifiers of any symptom and disease terms.The symptom and disease terms are defined by MeSH. We can find an association between symptoms and diseases by using the PubMed ids.

Program: The below program gives the pubmed identifiers of co-occurrence of symptoms and diseases.

Input file for Diseases:

Input file for Symptoms:

Output file: The output file contains list of pubmed identifiers of co-occurrence of diseases and symptoms.

Once the association between diseases and symptoms are identified, we can find the phenotype and genotype information based on symptoms. Let’s take a look at “Phenotype-Genotype” integrator. 

Phenotype is the composite of the organism’s observable characteristics. 

Genotype is the part of the genetic makeup of a cell which determines one of its characteristics.

What is Phegeni ?

Phegeni is a web interface that integrates various genomic databases with genome wide association study (GWAS). 

The genomic databases are from National Center for Biotechnology Information (NCBI) and the association data from National Human Genome Research Institute (NHGRI). Here, the phenotype terms are MESH terms .

  • The GWAS is a study of a genome-wide set of genetic variants in different individuals to observe if any variant is associated with phenotype/trait.
  • Clinicians and epidemiologists are interested with the results of GWAS because it helps to study design considerations and generation of biological hypotheses.
  • GWAS consists of  various results that is SNP rs,Gene ,Gene ID,Gene2,Gene ID2,Chromosome and Pubmed ids.

Phegeni Association results file:

Downloading all associate results at PheGeni browser and sample file looks as below.

The Association results can be accessed from here –

Program: The below program gives the list of SNP rs,Gene ,Gene ID,Gene2,Gene ID2,Chromosome and Pubmed ids of respective phenotype term.

Input file contains a list of phenotype search terms based on MESH and the sample file looks as below.

Output file: Output file contains list of SNP rs,Gene ,Gene ID,Gene2,Gene ID2,Chromosome and Pubmed ids of respective phenotype term.

In this way,we can retrieve genetic variants related to any Phenotype(s).


All the files (input, script and results) of symptom disease relationship,we have used in the above example are available on GitHub and can be downloadable from

All the files (input, script and results) of phegeni,we have used in the above example are available on GitHub and can be downloadable from

Connecting clinical trials to research articles

We can search PubMed database by specifying clinical trial id(s) and retrieve all the relevant journal articles. The NLM (The world’s largest medical library, the U.S. National Library of Medicine is part of the National Institutes of Health) extract those trail IDs and placed them into PubMed secondary id field.

How to  query single trial id??

Example: To search for journal articles related to a clinical trial id say NCT00000419, use PubMed’s application protocol interface (API) called e-Utils, which can be accessed through URL

Now, specify clinical trial id in the eutils api as shown below:[si]

All the journal articles related to the clinical trial id will be displayed as shown below:


In the above url, query string [SI] refers to Secondary ID which can be used to search for articles. This field contains accession numbers of various databases (molecular sequence data, gene expression or chemical compounds etc.)

How to  query multiple trial id’s ?

Here,we have taken a large number of clinical trial numbers in one file and the results were taken into another file which contains pubmed articles of respective trials.

Input file contains multiple NCT numbers which can be used as a query in PubMed API(e-utility) search field. Sample input file looks as below:

Trial ids

Below Perl language script will search for each clinical trial id specified in the input file against PubMed database using PubMed(API) E-Utils.

Output file: Output file contains PMIDs (pubmed records) of respective clinical trials.

Trials connected to research articles

In this way, we can retrieve journal articles related to any clinical trial(s) by using PubMed(API) E-Utils.  

All the files (input, script and results) we have used in the above example are available on GitHub and can be downloadable from