Electronic medical records contain text composed by hospital employees; this text often describes medical and socio-economic information that appears nowhere else in the record. Data analysts have historically ignored this text because unstructured text is hard to analyze: phrasing differs across authors, and misspellings and punctuation errors are frequent.
The advent of GPU computing and new research in machine learning have given us new tools to improve healthcare through analysis of this text. We use these tools to predict and prevent unplanned hospital readmissions. When a patient is readmitted to the hospital, they suffer the emotional and physical stress of a prolonged health problem, and the hospital incurs a financial penalty imposed by the Centers for Medicare & Medicaid Services.
By combining Word2Vec (developed at Google) with a convolutional neural network trained on our data, we developed a model that predicts 30-day readmission more precisely than well-established models.
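To make the pairing of word vectors and a convolutional network concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not our model: the function names, the toy vocabulary, the vector dimension, and the filter sizes are all hypothetical, and the weights are random rather than trained. In practice the vectors would come from a trained Word2Vec model and the filters from training on labeled notes.

```python
import math
import random


def conv1d_max_pool(embeddings, filters, bias):
    """Slide each filter over the token sequence; keep its largest ReLU activation."""
    width = len(filters[0])  # number of consecutive tokens one filter covers
    pooled = []
    for filt, b in zip(filters, bias):
        best = 0.0  # ReLU floor: negative activations are clipped to zero
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            act = b + sum(
                w * x
                for w_row, x_row in zip(filt, window)
                for w, x in zip(w_row, x_row)
            )
            best = max(best, act)
        pooled.append(best)
    return pooled


def predict_readmission(tokens, vectors, filters, bias, out_weights, out_bias):
    """Map a tokenized note to a probability-like score between 0 and 1."""
    dim = len(next(iter(vectors.values())))
    # Unknown tokens (for example, misspellings) fall back to a zero vector.
    embeddings = [vectors.get(t, [0.0] * dim) for t in tokens]
    features = conv1d_max_pool(embeddings, filters, bias)
    logit = out_bias + sum(w * f for w, f in zip(out_weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy setup (assumption): random stand-ins for Word2Vec vectors and CNN weights.
random.seed(0)
DIM, WIDTH, N_FILTERS = 4, 2, 3
vocab = ["patient", "lives", "alone", "no", "transportation", "stable"]
vectors = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
filters = [
    [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(WIDTH)]
    for _ in range(N_FILTERS)
]
bias = [0.0] * N_FILTERS
out_weights = [random.uniform(-1, 1) for _ in range(N_FILTERS)]

note = "patient lives alone no transportation".split()
score = predict_readmission(note, vectors, filters, bias, out_weights, 0.0)
print(f"untrained readmission score: {score:.3f}")
```

Max-over-time pooling lets a filter respond to a short phrase wherever it occurs in a note, which suits text whose phrasing varies across authors.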