Today, most enterprises have Business Intelligence (BI) and analytics
teams. They address the time-sensitive, operational needs of the
organization. Importantly, business decisions are made based on
insights discovered from BI platforms, and AI/ML often helps project
into the future. In most cases, BI platforms and the workforce behind
them are acknowledged and highly valued by the CXO team. Data from
various departments, including sales and R&D, flows into BI platforms
via data warehouses, data lakes or lakehouses. However, direct access to
operational DBMS systems is still needed at times. Data may also need
to flow in reverse (reverse ETL) from data warehouses to DBMS systems.
The above scheme is mostly accepted as necessary. In reality,
adoption, data lineage, speed of insight generation and subsequent
discovery vary. Often human insight is still ahead of the system.
This is an opportunity for improvement.
One of the key additions to the above ecosystem could be Enterprise
Knowledge Graphs (EKGs). They can address a critical need for
‘drilling down’ into the data to arrive at the ‘nugget of gold’. While
this is feasible in the current scheme, it depends on human skill. A
skilled CXO might be able to get to the ‘insight’ with the right ‘SQL’
query (they may or may not write it themselves). This is not uncommon.
EKGs have the potential to bring together key ‘identities’ and their
‘relationships’ across the organization: people, departments, products,
customers, geography, time, research, language and inter-dependencies.
The ‘operational’ facts can/should continue to come from data
Can an organization achieve the benefits of an EKG by leveraging
investments in a ‘Master data management’ (MDM) system? Yes, partially.
In practice, ‘MDM’ is not brought into BI platforms; it is siloed and has
less visibility. ‘Mastering’ data is considered a data engineering act.
An EKG system, by contrast, would address the organizational needs more
holistically when it is integrated into BI platforms through GraphQL.
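To make the idea of ‘identities’ and ‘relationships’ concrete, here is a minimal Python sketch of a tiny in-memory graph with a ‘drill-down’ traversal. It is only an illustration under stated assumptions: the class, the entity names (dept:Sales, person:Asha, customer:Acme and so on) and the relation names are all hypothetical, and a real EKG would sit on a graph store and be exposed through an API such as GraphQL rather than a Python dictionary.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory store of (subject, relation, object) triples."""

    def __init__(self):
        # subject -> list of (relation, object) pairs
        self._out = defaultdict(list)

    def add(self, subject, relation, obj):
        self._out[subject].append((relation, obj))

    def neighbors(self, subject, relation=None):
        """Objects linked from subject, optionally filtered by relation."""
        return [o for r, o in self._out[subject] if relation is None or r == relation]

    def drill_down(self, start, relations):
        """Follow a chain of relations from start; return the entities reached."""
        frontier = [start]
        for relation in relations:
            frontier = [o for s in frontier for o in self.neighbors(s, relation)]
        return frontier


# Hypothetical identities and relationships, for illustration only.
kg = KnowledgeGraph()
kg.add("dept:Sales", "employs", "person:Asha")
kg.add("dept:R&D", "employs", "person:Ravi")
kg.add("person:Asha", "manages_account", "customer:Acme")
kg.add("customer:Acme", "located_in", "geo:EMEA")
kg.add("customer:Acme", "buys", "product:Widget")
kg.add("person:Ravi", "develops", "product:Widget")

# "Which products do the Sales department's accounts buy?"
products_sold_by_sales = kg.drill_down(
    "dept:Sales", ["employs", "manages_account", "buys"]
)
print(products_sold_by_sales)
```

The point of the sketch is that the drill-down is expressed as a path of named relationships instead of a hand-written join, which is the part that otherwise depends on individual SQL skill.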
The understanding needed for building an EKG comes naturally to any
organization’s team; they know their domain intuitively. Skills and
standards may still be evolving. Web3 and the related conversations
around the semantic web are helping bridge the gaps, although most of
these conversations are about blockchains. The semantic web is a
necessary area that needs a focused effort of its own in the very near
future. EKGs, though, can be built now and can provide value right away.
Let us know if EKGs or the Semantic Web interest you. Here’s
an open knowledge graph that can help you draw an analogy to your
organizational needs. Write to us. We are happy to
A simple time-series model, built using a spreadsheet alone, that aims to simulate changes in global population, including its peak and plateau. AI/ML models need training, validation and testing before use. Simple statistical/problem modeling with the necessary assumptions and parameters can help you simulate a real-world problem faster. This exercise is almost always necessary and is best done upfront.
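As a rough illustration of this kind of model, the Python sketch below fills a population series row by row, the way a spreadsheet column would be filled downward. It is not the spreadsheet referred to here: the logistic form, the function name and every parameter value (starting population, base growth rate, carrying capacity) are assumptions chosen only to show a curve that rises and then flattens.

```python
def simulate_population(p0, r0, capacity, years):
    """Row-by-row recurrence, like dragging a spreadsheet formula downward.

    Each year:
        growth_rate = r0 * (1 - population / capacity)
        population  = population * (1 + growth_rate)
    """
    series = [p0]
    for _ in range(years):
        current = series[-1]
        growth_rate = r0 * (1 - current / capacity)
        series.append(current * (1 + growth_rate))
    return series


# Illustrative assumptions only: billions of people and a per-year rate.
series = simulate_population(p0=8.0, r0=0.03, capacity=10.5, years=200)

peak = max(series)
print(f"start={series[0]:.2f}  year 50={series[50]:.2f}  final={series[-1]:.2f}")
```

In this simple form the growth rate shrinks towards zero but never turns negative, so the peak and the plateau coincide at the assumed capacity. Modeling a peak followed by a decline would need an additional assumption, for example a growth rate that is allowed to go below zero.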
Ping us if you have a data/analytics/AI-amenable problem. We would love to help you realize your business objectives faster and iteratively get you to the right value-based RoI state.
Note: Please feel free to copy and play around, or provide feedback as comments.
PS: This is a no-code/low-code problem modeling and simulation exercise. What is an AI-amenable problem, and why is statistical/problem modeling needed upfront? Here’s a relevant article to read further: https://lnkd.in/e6VJhcr3
Web 3.0 – Decentralized, Semantic, No trust/permissions needed – IPFS, Blockchain.
Semantic Web – fully connected WWW beyond hyperlinks.
Current hyperlinks may break, redirect or be subverted (by updated pages). No versioning is feasible. Search engines fill this gap only slightly, using page ranking and other heuristics/proxies for intention and authenticity.
Blockchain-based non-repudiation with versioning will make a link immutable.
Connected data/information/insights/knowledge/wisdom across various dimensions, domains and any observable phenomena.
GraphQL (Will need to be further refined) – for API.
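The idea of an immutable, versioned link can be sketched with content addressing: the link is the hash of what it points to, and each version records the link of the version before it. The Python below is a minimal sketch under stated assumptions, not a blockchain or IPFS implementation. The function names and the record layout are hypothetical, and true non-repudiation would also need digital signatures, which are not shown.

```python
import hashlib
import json


def _digest(record):
    """SHA-256 over a canonical JSON encoding of the record."""
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_version(content, previous_id=None):
    """Return (link_id, record); the link id is derived from the content."""
    record = {"content": content, "previous": previous_id}
    return _digest(record), record


def verify(link_id, record):
    """A link is valid only if the record still hashes to the same id."""
    return _digest(record) == link_id


def history(link_id, store):
    """Walk back from a version to the first one via 'previous' links."""
    chain = []
    while link_id is not None:
        chain.append(link_id)
        link_id = store[link_id]["previous"]
    return chain


# Hypothetical page with two versions.
store = {}
v1_id, v1 = make_version("Page text, first edition")
store[v1_id] = v1
v2_id, v2 = make_version("Page text, revised", previous_id=v1_id)
store[v2_id] = v2

print(verify(v1_id, v1))            # the untouched record matches its link
print(len(history(v2_id, store)))   # number of versions reachable from v2
```

Because the link id is computed from the content, an updated page cannot silently replace what an existing link points to; it can only be published as a new version with a new link.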
As in any other domain, there have been several significant developments in health standards in recent years. Several new standards, such as FHIR, have evolved and seen strong adoption globally.
However, health is a domain of domains, and many areas within the health industry can benefit significantly from a strong set of standards. BioInformatics is one such sub-domain that could benefit from stronger standards and better adoption.
There are quite a few standards already in BioInformatics. A few are listed below:
The last amongst these, OpenWDL, is being championed by MIT and Broad Institute and has generated significant interest in the BioInformatics community and the industry at large.
Below is a brief listing of these standards and how support for them stacks up today.
VaidhyaMegha’s BISDLC SaaS offering has out-of-the-box support for all 3 paradigms and more. Below is a very quick demonstration, built on top of the fantastic open source platform BioStar central, that shows in an oversimplified manner how these paradigms can be supported with small enhancements to BioStar central.
The Clinical trial data file (clinical trial data.csv) is taken from “ClinicalTrials.gov”. ClinicalTrials.gov is a database of privately and publicly funded clinical studies conducted around the world.
It is a Web-based resource that provides patients, their family members, health care professionals, researchers, and the public with easy access to information on publicly and privately supported clinical studies on a wide range of diseases and conditions.
The Web site is maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH).
In the previous article, Connecting Clinical Trials to Research Articles, we saw how to search the PubMed database by specifying clinical trial id(s) and retrieve all the relevant journal articles. In this article, let’s learn about the association between symptoms and diseases, and about Phenotype-Genotype.
Importance of symptom and disease relationship
A disease is an abnormal condition that negatively affects the functioning of an organism. A symptom is a physical or mental feature that can indicate a condition. The relationship between diseases and their symptoms is important for diagnosing any disease. This information is also useful for medical research purposes.
Each article in PubMed is associated with metadata that includes the major topics of the article. By using a Perl script with the NCBI E-utilities, we can retrieve the PubMed identifiers for any symptom and disease terms. The symptom and disease terms are defined by MeSH. We can find an association between symptoms and diseases by using the PubMed ids.
Program: The program below gives the PubMed identifiers for co-occurrences of symptoms and diseases.
Input file for Diseases:
Input file for Symptoms:
Output file: The output file contains the list of PubMed identifiers in which diseases and symptoms co-occur.
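For readers who want to try the co-occurrence idea without the Perl script, here is a minimal Python sketch of the same approach. It is not the program described in this article. It assumes the E-utilities ESearch endpoint with JSON output, looks up one MeSH term at a time, and intersects the returned PubMed ids. The function names are hypothetical, the retmax value is an arbitrary choice, and reading terms from the input files is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(mesh_term, retmax=1000):
    """Build an ESearch URL that looks up a MeSH term in PubMed."""
    params = {
        "db": "pubmed",
        "term": f'"{mesh_term}"[MeSH Terms]',
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


def parse_idlist(payload):
    """Extract the set of PubMed ids from an ESearch JSON response."""
    return set(json.loads(payload)["esearchresult"]["idlist"])


def fetch_pmids(mesh_term, retmax=1000):
    """Query ESearch over the network (not called in this sketch)."""
    with urlopen(esearch_url(mesh_term, retmax)) as response:
        return parse_idlist(response.read().decode("utf-8"))


def co_occurrence(disease_pmids, symptom_pmids):
    """PubMed ids that were returned for both the disease and the symptom."""
    return sorted(disease_pmids & symptom_pmids)


# Offline demonstration with made-up ids, so nothing is fetched here.
disease_ids = parse_idlist('{"esearchresult": {"idlist": ["1", "2", "3"]}}')
symptom_ids = parse_idlist('{"esearchresult": {"idlist": ["2", "3", "4"]}}')
print(co_occurrence(disease_ids, symptom_ids))
```

With network access, `co_occurrence(fetch_pmids(disease), fetch_pmids(symptom))` would play the role of one row of the output file for a given disease and symptom pair. Large result sets would need paging beyond a single retmax window.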
Phenotype-Genotype
Once the associations between diseases and symptoms are identified, we can find the phenotype and genotype information based on symptoms. Let’s take a look at the “Phenotype-Genotype” integrator.
Phenotype is the composite of an organism’s observable characteristics.
Genotype is the part of the genetic makeup of a cell which determines one of its characteristics.
PheGenI is a web interface that integrates various genomic databases with genome-wide association study (GWAS) data.
The genomic databases are from the National Center for Biotechnology Information (NCBI) and the association data is from the National Human Genome Research Institute (NHGRI). Here, the phenotype terms are MeSH terms.
A GWAS is a study of a genome-wide set of genetic variants in different individuals to see whether any variant is associated with a phenotype/trait.
Clinicians and epidemiologists are interested in the results of GWAS because they help with study design considerations and the generation of biological hypotheses.
The GWAS results consist of several fields: SNP rs, Gene, Gene ID, Gene2, Gene ID2, Chromosome and PubMed ids.
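The fields listed above can be read and linked back to PubMed ids with a few lines of Python. This is a sketch under stated assumptions: it assumes a tab-delimited file whose header uses exactly the field names from the text, which may not match a real download, and the two rows in the example are invented placeholders rather than real association results.

```python
import csv
import io

# Field names as listed in the text; a real export may spell them differently.
COLUMNS = ["SNP rs", "Gene", "Gene ID", "Gene2", "Gene ID2", "Chromosome", "PubMed ids"]


def read_associations(handle, delimiter="\t"):
    """Read association rows, keeping only the fields named in COLUMNS."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [{col: (row.get(col) or "") for col in COLUMNS} for row in reader]


def genes_for_pmids(rows, pmids):
    """Genes reported in rows whose PubMed id is in the given set."""
    genes = set()
    for row in rows:
        if row["PubMed ids"] in pmids:
            genes.update(g for g in (row["Gene"], row["Gene2"]) if g)
    return sorted(genes)


# Placeholder rows for illustration only; these are not real results.
sample = (
    "SNP rs\tGene\tGene ID\tGene2\tGene ID2\tChromosome\tPubMed ids\n"
    "rs0000001\tGENE_A\t1\tGENE_B\t2\t1\t11111111\n"
    "rs0000002\tGENE_C\t3\t\t\t2\t22222222\n"
)

rows = read_associations(io.StringIO(sample))
print(genes_for_pmids(rows, {"11111111"}))
```

Passing the PubMed ids from a disease and symptom co-occurrence search as `pmids` gives one way to move from a symptom to candidate genes, which is the link this section describes.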
PheGenI association results file:
All association results can be downloaded from the PheGenI browser; a sample file looks as below.