Projects
Prediction of Hotspots for COVID-19 and Influenza for the USA
We used publicly available data to predict infections and deaths for each county in the USA for several weeks into the future. We combined primary data on COVID-19 infections and deaths with data on the demographic characteristics of counties, the mobility and masking of the population, and the developing understanding of the virus's incubation. After dimensionality reduction of the time-varying and time-independent features to remove inter-dependencies, we implemented a Bayesian time-series methodology based on time-delayed regressors. A similar process was implemented for influenza at the state level in the USA using multi-year infection data. In both cases, hotspots could be identified efficiently and reliably as zones where morbidity and mortality were expected to increase significantly faster than the national average. The results were provided to government agencies and health systems to help set policy and allocate resources, and were also published for individuals to evaluate their personal risk.
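The two mechanical steps described above, pairing each outcome with earlier values of a driver (time-delayed regressors) and flagging regions that are forecast to grow faster than the national average, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the lag choices, and the 1.5x threshold are hypothetical and not taken from the project, and the Bayesian model fitting itself is not shown.

```python
from typing import Dict, List, Tuple


def lagged_rows(target: List[float], driver: List[float],
                lags: List[int]) -> Tuple[List[List[float]], List[float]]:
    """Pair each target value with driver values observed `lags` steps earlier."""
    max_lag = max(lags)
    rows: List[List[float]] = []
    ys: List[float] = []
    for t in range(max_lag, len(target)):
        rows.append([driver[t - lag] for lag in lags])
        ys.append(target[t])
    return rows, ys


def flag_hotspots(forecast_growth: Dict[str, float], national_growth: float,
                  ratio: float = 1.5) -> List[str]:
    """Return regions whose forecast growth exceeds `ratio` times the national figure.

    The default ratio is an assumed threshold chosen for illustration.
    """
    return sorted(name for name, growth in forecast_growth.items()
                  if growth > ratio * national_growth)


# Toy weekly series: cases as the target, a mobility index as the driver.
cases = [10.0, 12.0, 15.0, 19.0, 24.0]
mobility = [1.0, 2.0, 3.0, 4.0, 5.0]
rows, ys = lagged_rows(cases, mobility, lags=[1, 2])

hotspots = flag_hotspots({"A": 0.30, "B": 0.05, "C": 0.20}, national_growth=0.10)
```

Each row in `rows` holds the driver values from one and two weeks before the matching entry in `ys`, which is the shape a regression on delayed regressors would consume.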
Modeling the Mortality and Morbidity Risk of COVID-19 and Influenza due to Demographic, Socio-Economic, and Behavioral Factors in the USA
We used county-level demographic, socio-economic, and behavioral data to construct risk factors associated with population, age, household size, income, mobility, and mask usage, and combined them into a risk score for each county in the USA. Risk scores were computed for both cases and deaths, and hence for morbidity and mortality. Principal components were constructed to eliminate linear dependence among the predictors, and polynomial regression was used to arrive at the risk-scoring equations. The process was invertible, so the leading cause of high risk in a particular county could be explained. The process was repeated for influenza to compute state-level risk scores for each year since 2013. The results were provided to government agencies and health systems to monitor the spread of the epidemic and plan vaccination drives. The information was also published in a portal for public awareness.
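As a rough sketch of the principal-components-plus-polynomial-regression idea, the NumPy example below removes a linear dependence among three synthetic predictors and fits a quadratic in the resulting scores. The data, the choice of two components, the quadratic degree, and the function names are all assumptions made for illustration; the actual risk-scoring equations are not given in the description above.

```python
import numpy as np


def pca_scores(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project standardized columns of X onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def fit_quadratic(scores: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares fit of y on [1, scores, scores**2] (no cross terms)."""
    design = np.hstack([np.ones((len(scores), 1)), scores, scores ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# The third column is the sum of the first two, so the raw predictors are
# linearly dependent; two principal components capture all the variation.
X = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + base[:, 1]])
scores = pca_scores(X, n_components=2)

# Synthetic "risk" generated from the first component only.
y = 1.0 + 2.0 * scores[:, 0] + 0.5 * scores[:, 0] ** 2
coef = fit_quadratic(scores, y)  # order: intercept, s0, s1, s0^2, s1^2
```

Because each component is a known linear combination of the standardized inputs, a high score can be traced back through the loadings in `Vt` to the original factors, which is one way to read the invertibility mentioned above.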
Data Warehousing and Analytics for Public Health Insurance
This project delivers a next-generation analytical ecosystem for Dr.NTRVST with the twofold objective of providing descriptive and predictive analytics to its stakeholders. The proposed system should help users move beyond the static operational reports available in the Dr.NTRVST transactional system to a self-service analytical environment in which they can slice and dice the data and gain insights by applying statistical and data mining techniques. It should automate analysis activities that are currently performed manually and bring best practices into the analytical workflow. Exceptions and hidden trends in the Dr.NTRVST data should be identified with minimal manual intervention, making decision-making and planning more efficient.
Products
Scalable Platform to Provide Actionable Information for Health Insurance Decisions
We built a product that integrates clinical, claims, psychographic, and physiologic data from a variety of sources into one platform and presents a holistic patient view through an intuitive dashboard that supports physician decision-making and care team action, improving financial, quality, and health outcomes. The product included a clinical data model and predictive algorithms that exceed the industry norms for predictive accuracy. Traditionally, health plans have used actuarial models that determine risk and future costs at the population level. Our platform used machine learning techniques to create a dynamic risk score for each member, which improves precision and personalization and enables health plans to act on the insights to prevent health deterioration and future costs.
Letter of Protection Management System
We created a web-based medical practice management system for linking patients to doctors and attorneys, scheduling medical procedures, and tracking the resulting claims. The application is optimized for personal injury business processes. It tracks claims throughout their lifetime, which includes receiving new orders, exchanging documents between attorneys and physicians, scheduling patients for their procedures, and generating reports and invoices. The application uses a multi-tenant architecture so that it can onboard different companies and process their requests concurrently. The document system is designed to be HIPAA compliant for better security. The backend is a Flask server deployed on Google Compute Engine in Google Cloud Platform, with Cloud Datastore and Cloud Storage as its main data stores. The frontend is built with Vuetify and hosted on Firebase Hosting.
Radiaide: A Platform for Radiological Image Analysis and Screening
Radiaide is a cloud-based platform for fast and reliable medical image analysis. It supports multiple disease models and imaging modalities and can be used to instantly provide artificial-intelligence-based analysis of those images. It is intended for cases where imaging is performed in remote regions, with instant screening followed by validation by remote radiologists. The platform supports state-of-the-art security and privacy features along with an efficient picture archiving and communication system and annotative viewers for different images. A series of single-purpose neural network models are available to be trained and used. Radiologists and other stakeholders are presented with the model results, which can be corrected by authorized personnel. Currently, the platform supports chest X-ray images for tuberculosis screening.
Research
Constructing Statistical Characteristics of COVID-19 Infection Trees from Contact-Tracing Data
The global impact of the COVID-19 pandemic has highlighted the need for modern data-driven approaches to managing such epidemiological events with minimum socioeconomic impact. To this end, we analyze publicly available contact tracing data from Karnataka, India to rebuild infection trees based on the ancestry of infections revealed in the data. In an attempt to statistically analyze these trees, we find that both the number of infections originating from a person as well as the size of the tree created by a hierarchy of such infections show a remarkably similar characteristic: the tails of these distributions decay slowly and appear to conform to the power-law form. As a consequence of this discovery, mitigation strategies could be designed by identifying and containing super-spreaders along with milder general restrictions.
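A minimal sketch of the tree-building step: given a mapping from each case to its known infector, the Python code below computes the number of direct infections per person, the size of each infection tree, and an empirical tail distribution that could then be inspected for slow decay. The case identifiers and the input layout are hypothetical, and the sketch assumes every named infector also appears as a case in the data.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def build_children(parent_of: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Invert a case -> infector mapping into infector -> [cases]."""
    children: Dict[str, List[str]] = defaultdict(list)
    for case, parent in parent_of.items():
        if parent is not None:
            children[parent].append(case)
    return children


def offspring_counts(parent_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Number of direct infections attributed to each case."""
    children = build_children(parent_of)
    return {case: len(children.get(case, [])) for case in parent_of}


def tree_sizes(parent_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Size of each infection tree, keyed by its root (a case with no known infector)."""
    children = build_children(parent_of)
    sizes: Dict[str, int] = {}
    for root, parent in parent_of.items():
        if parent is not None:
            continue
        stack, count = [root], 0
        while stack:  # iterative depth-first walk over the tree
            node = stack.pop()
            count += 1
            stack.extend(children.get(node, []))
        sizes[root] = count
    return sizes


def ccdf(values: List[int]) -> Dict[int, float]:
    """Empirical P(X >= x) for each distinct x in `values`."""
    counts = Counter(values)
    remaining = len(values)
    out: Dict[int, float] = {}
    for x in sorted(counts):
        out[x] = remaining / len(values)
        remaining -= counts[x]
    return out


# Hypothetical contact-tracing records: case -> infector (None if unknown).
parent_of = {"P1": None, "P2": "P1", "P3": "P1", "P4": "P2", "P5": None, "P6": "P5"}
sizes = tree_sizes(parent_of)
offspring = offspring_counts(parent_of)
tail = ccdf(list(offspring.values()))
```

Plotting `tail` on log-log axes is a common first check for power-law-like behavior; a roughly straight line over a wide range suggests a heavy tail, although a formal fit needs more care.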
Heterogeneous Contact Networks in COVID-19 Spreading: The Role of Social Deprivation
We have two main aims. First, we use theories of disease spreading on networks to look at the COVID-19 epidemic on the basis of individual contacts; these give rise to predictions that often differ from those of the homogeneous mixing approaches usually used. Our second aim is to look at the role of social deprivation, again using networks as our basis, in the spread of this epidemic. We choose the city of Kolkata as a case study, but assert that the insights so obtained are applicable to a wide variety of densely populated urban environments where social inequalities are rampant. Our predictions of hotspots are found to be in good agreement with those currently being identified empirically as containment zones and provide a useful guide for identifying potential areas of concern.
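To illustrate why contact heterogeneity changes predictions, the Python sketch below uses a standard result for uncorrelated random networks: an SIR-type outbreak can spread widely once the per-contact transmission probability exceeds <k> / (<k^2> - <k>), where k is the number of contacts per person. This is a textbook approximation chosen for illustration and is not necessarily the model used in the study; the degree sequences are made up.

```python
from typing import List


def critical_transmissibility(degrees: List[int]) -> float:
    """Epidemic threshold <k> / (<k^2> - <k>) for an uncorrelated random network."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / (mean_k2 - mean_k)


# Two hypothetical populations with the same average number of contacts (4).
homogeneous = [4] * 100                 # everyone has 4 contacts
heterogeneous = [2] * 90 + [22] * 10    # most have 2, a few have 22

t_homogeneous = critical_transmissibility(homogeneous)      # 4 / 12
t_heterogeneous = critical_transmissibility(heterogeneous)  # 4 / 48
```

With the same mean number of contacts, the heterogeneous population has a threshold four times lower, so a pathogen that would die out under homogeneous mixing can still spread through the highly connected minority.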