In Healthcare & Epidemiology

Prediction of Hotspots for COVID-19 and Influenza for the USA

We used publicly available data to predict infections and deaths for each county in the USA several weeks into the future. We combined primary data on COVID-19 infections and deaths with data on the demographic characteristics of counties, the mobility and masking behavior of the population, and the developing understanding of the virus’s incubation period. After dimensionality reduction of the time-varying and time-independent features to remove inter-dependencies, we implemented a Bayesian time-series methodology based on time-delayed regressors. A similar process was implemented for influenza at the state level using multi-year infection data. In both cases, hotspots could be identified efficiently and reliably as zones where morbidity and mortality were expected to increase significantly faster than the national average. The results were provided to government agencies and health systems to help set policy and allocate resources, and were also published for individuals to evaluate their personal risk.
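As a rough illustration of two ideas in this description, the sketch below pairs a regressor with a target shifted by a fixed delay, estimates a single slope with a Gaussian prior, and flags hotspots by comparing each county's projected growth with national growth. It is a minimal sketch under stated assumptions, not the project's actual model: the function names, the single-regressor form, the prior variance, and the 1.5x margin are all hypothetical.

```python
from typing import Dict, List, Tuple


def lagged_pairs(regressor: List[float], target: List[float],
                 lag: int) -> List[Tuple[float, float]]:
    """Pair regressor[t - lag] with target[t] (a time-delayed regressor)."""
    return [(regressor[t - lag], target[t]) for t in range(lag, len(target))]


def posterior_mean_slope(pairs: List[Tuple[float, float]],
                         noise_var: float = 1.0,
                         prior_var: float = 10.0) -> float:
    """Posterior mean of b in y = b * x + noise, with prior b ~ N(0, prior_var).

    Assumes known noise variance and no intercept (illustration only).
    """
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    return sxy / (sxx + noise_var / prior_var)


def flag_hotspots(current: Dict[str, float], forecast: Dict[str, float],
                  margin: float = 1.5) -> List[str]:
    """Counties whose projected growth exceeds national growth by `margin`.

    Assumes the national total of `current` is positive.
    """
    national_growth = sum(forecast.values()) / sum(current.values())
    return sorted(
        county for county, now in current.items()
        if now > 0 and forecast[county] / now > margin * national_growth
    )


# Hypothetical data: mobility leads cases by two weeks.
mobility = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cases = [0.0, 0.0, 2.0, 4.0, 6.0, 8.0]
slope = posterior_mean_slope(lagged_pairs(mobility, cases, lag=2))

hotspots = flag_hotspots(
    current={"A": 100, "B": 100, "C": 100},
    forecast={"A": 110, "B": 105, "C": 250},
)
```

The prior shrinks the slope slightly toward zero, which is the main practical difference from an ordinary least-squares fit on the same lagged pairs.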

Modeling the Mortality and Morbidity Risk of COVID-19 and Influenza due to Demographic, Socio-Economic, and Behavioral Factors in the USA

We used county-level demographic, socio-economic, and behavioral data to construct risk factors associated with population, age, household size, income, mobility, and mask usage, and from these computed a risk score for each county in the USA. Risk scores were computed separately for cases and deaths, and hence for morbidity and mortality. Principal components were constructed to eliminate linear dependence among the predictors, and polynomial regression was used to arrive at the risk-scoring equations. The process was invertible, so the leading cause of high risk in a particular county could be explained. The process was repeated for influenza to compute state-level risk scores for each year since 2013. The results were provided to government agencies and health systems to monitor the spread of the epidemic and plan vaccination drives. The information was also published in a portal for public awareness.
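The combination of principal components and polynomial regression can be sketched as follows. This is an illustrative outline only, assuming a single leading component and a quadratic fit; the helper names, the power-iteration approach, and the toy data are hypothetical and not taken from the project.

```python
import math
from typing import List

Matrix = List[List[float]]


def standardize(rows: Matrix) -> Matrix:
    """Center each column and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_component(rows: Matrix, iters: int = 200) -> List[float]:
    """Leading eigenvector of the covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]  # asymmetric start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows: Matrix, component: List[float]) -> List[float]:
    """Score of each row along the given component."""
    return [sum(a * b for a, b in zip(r, component)) for r in rows]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(z: List[float], y: List[float], degree: int = 2) -> List[float]:
    """Least-squares coefficients [c0, c1, ...] of y ~ sum_k c_k * z**k."""
    powers = [[zi ** k for k in range(degree + 1)] for zi in z]
    ata = [[sum(p[i] * p[j] for p in powers) for j in range(degree + 1)]
           for i in range(degree + 1)]
    aty = [sum(p[i] * yi for p, yi in zip(powers, y))
           for i in range(degree + 1)]
    return solve(ata, aty)


def risk_score(coeffs: List[float], z: float) -> float:
    """Evaluate the fitted risk polynomial at component score z."""
    return sum(c * z ** k for k, c in enumerate(coeffs))


# Toy check: two perfectly correlated predictors collapse onto one component.
features = standardize([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
pc1 = first_component(features)
scores = project(features, pc1)
coeffs = fit_polynomial([-2.0, -1.0, 0.0, 1.0, 2.0], [9.0, 2.0, 1.0, 6.0, 17.0])
```

Because each component is a known linear combination of the standardized predictors, a high score can be traced back through the component loadings to the original factors, which is one way to read the invertibility mentioned above.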

Data Warehousing and Analytics for Public Health Insurance

The project provides a next-generation, state-of-the-art analytical ecosystem for Dr.NTRVST, with the twofold objective of delivering descriptive and predictive analytics to its stakeholders. The proposed analytical system should help users move beyond the static operational reports available in the Dr.NTRVST transactional system to a self-service analytical environment where they can slice and dice the Dr.NTRVST data and gain insights by applying various statistical and data-mining techniques. The system should automate activities that are currently performed manually when analyzing the Dr.NTRVST data, and should bring best practices into the analytical workflow. The end result, identifying exceptions and hidden trends in the Dr.NTRVST data, should be achieved with minimal manual intervention, making the overall decision-making and planning process more efficient.

In Transport & Logistics

Short-term Demand and Supply Forecasting for Bike Taxis

Bike taxis have become a popular transport option, nowhere more so than in Indonesia, where consolidating bike taxis into an app-based platform resulted in the country’s first tech unicorn. However, predicting the fluctuating demand and the corresponding supply in a large city is challenging, especially since the desire for bike rides in tropical regions is strongly connected to the weather. A model to forecast local demand and supply for the upcoming fifteen minutes was requested, based on current and historical data. The problem was divided into two segments: demand was predicted from historical information, the current trend, and weather information; supply was predicted from historical driver tendencies, the current locations of drivers, and the estimated endpoints of currently active rides. Neural networks and Markov chains were employed to arrive at a model with around 90% forecast accuracy.
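To make the Markov-chain part of the supply forecast concrete, the following sketch estimates zone-to-zone transition probabilities from observed driver movements over one fifteen-minute interval and uses them to compute expected driver counts for the next interval. It is a simplified illustration under assumptions: the zone labels, the function names, and the rule that drivers in unseen zones stay put are hypothetical, and the actual model also used neural networks and ride endpoints.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Transitions = Dict[str, Dict[str, float]]


def estimate_transitions(moves: List[Tuple[str, str]]) -> Transitions:
    """Estimate P(next zone | current zone) from observed (from, to) moves."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for src, dst in moves:
        counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }


def expected_supply(current: Dict[str, float],
                    transitions: Transitions) -> Dict[str, float]:
    """Expected number of drivers per zone after one interval."""
    nxt: Dict[str, float] = defaultdict(float)
    for zone, drivers in current.items():
        # Assumption: with no history for a zone, drivers are taken to stay put.
        for dst, p in transitions.get(zone, {zone: 1.0}).items():
            nxt[dst] += drivers * p
    return dict(nxt)


# Hypothetical history: half of the drivers in zone A move to zone B.
history = [("A", "A"), ("A", "B"), ("B", "B"),
           ("B", "B"), ("A", "B"), ("A", "A")]
matrix = estimate_transitions(history)
forecast = expected_supply({"A": 10, "B": 4}, matrix)
```

Applying `expected_supply` repeatedly would extend the forecast by further intervals, at the cost of compounding estimation error.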

In Banking & Insurance

Data Warehousing and Analytics for Public Health Insurance

The project provides a next-generation, state-of-the-art analytical ecosystem for Dr.NTRVST, with the twofold objective of delivering descriptive and predictive analytics to its stakeholders. The proposed analytical system should help users move beyond the static operational reports available in the Dr.NTRVST transactional system to a self-service analytical environment where they can slice and dice the Dr.NTRVST data and gain insights by applying various statistical and data-mining techniques. The system should automate activities that are currently performed manually when analyzing the Dr.NTRVST data, and should bring best practices into the analytical workflow. The end result, identifying exceptions and hidden trends in the Dr.NTRVST data, should be achieved with minimal manual intervention, making the overall decision-making and planning process more efficient.

Modeling Loan Default Risk from Subjective Expert Experience

Our client, a microfinance company that disburses home loans to low-income individuals working in the unorganized sector, required a system to rate the risk of such loans. The risk of default is high because the applicants’ creditworthiness is not established by documentation, and the loans run for a considerable period of time. Sales personnel gather a multitude of information, either directly from the applicant or from other sources. This information becomes the primary data source for deciding whether the loan will be approved, based on an estimate of the probability of successful repayment by the applicant. The principals of Aashiyan have decades of experience in underwriting microfinance loans and are able to make considered decisions based on the available data. However, such expertise is mostly informal and subjective and cannot be easily translated into policy or automated risk scoring. Each evaluation is time-consuming, which makes it expensive for the organization to scale. Additionally, in the absence of a formal mathematical model, new information is harder to incorporate, and the evaluations suffer from a lack of flexibility. Our system formalizes the subjective expertise of the principals and builds a mathematical structure for risk scoring. The model was then integrated into the client’s data collection system, where it provides risk scores instantly as the information is entered. Finally, the model adapts to information about loan outcomes and is able to learn from experience.
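One way to picture a score that starts from expert judgment and then learns from outcomes is a logistic model whose initial weights encode the experts' view and are nudged after each observed repayment or default. The sketch below is an assumption for illustration only: the source does not state that a logistic model was used, and the feature names, initial weights, and learning rate are hypothetical.

```python
import math
from typing import Dict, Tuple


def repayment_probability(weights: Dict[str, float], bias: float,
                          features: Dict[str, float]) -> float:
    """Logistic estimate of the probability that the loan is repaid."""
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


def update_on_outcome(weights: Dict[str, float], bias: float,
                      features: Dict[str, float], repaid: bool,
                      learning_rate: float = 0.1) -> Tuple[Dict[str, float], float]:
    """One gradient step on the log-loss for a single observed loan outcome."""
    predicted = repayment_probability(weights, bias, features)
    error = (1.0 if repaid else 0.0) - predicted
    new_weights = {name: w + learning_rate * error * features.get(name, 0.0)
                   for name, w in weights.items()}
    return new_weights, bias + learning_rate * error


# Hypothetical expert-elicited starting point.
expert_weights = {"stable_income": 1.0, "existing_debt": -1.0}
applicant = {"stable_income": 1.0, "existing_debt": 0.0}

before = repayment_probability(expert_weights, 0.0, applicant)
learned_weights, learned_bias = update_on_outcome(
    expert_weights, 0.0, applicant, repaid=False)
after = repayment_probability(learned_weights, learned_bias, applicant)
```

Starting from expert-set weights lets the score be used before any outcome data exists, while each recorded outcome moves it gradually toward what the loan book actually shows.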