<P> Search engines have become popular tools for indexing and searching through such data, especially text. </P> <P> Specific computational workflows have been developed to impose structure upon the unstructured data contained within text documents. These workflows are generally designed to handle sets of thousands or even millions of documents, far more than manual approaches to annotation can handle. Several of these approaches are based upon the concept of online analytical processing (OLAP) and may be supported by data models such as text cubes. Once document metadata is available through a data model, summaries of subsets of documents (i.e., cells within a text cube) may be generated with phrase-based approaches. </P> <P> Biomedical research is one major source of unstructured data, as researchers often publish their findings in scholarly journals. Though it is challenging to derive structural elements from the language in these documents (e.g., due to the complicated technical vocabulary and the domain knowledge required to fully contextualize observations), doing so may yield links between technical and medical studies and clues regarding new disease therapies. Recent efforts to impose structure upon biomedical documents include self-organizing map approaches for identifying topics among documents, general-purpose unsupervised algorithms, and an application of the CaseOLAP workflow to determine associations between protein names and cardiovascular disease topics in the literature. CaseOLAP defines phrase-category relationships in an accurate (identifies relationships), consistent (highly reproducible), and efficient manner. This platform offers enhanced accessibility and provides the biomedical community with phrase-mining tools for widespread biomedical research applications. </P>
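<P> The text cube idea mentioned above can be illustrated with a minimal sketch. It assumes a toy corpus in which each document carries metadata for two hypothetical dimensions (<code>topic</code> and <code>year</code>); documents sharing the same dimension values fall into one cell, and a cell is summarized by the two-word phrases that occur often inside it and rarely elsewhere. The corpus, the function names, and the scoring ratio are all illustrative assumptions and are not the scoring used by CaseOLAP or any other published workflow. </P>

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: the "topic" and "year" fields stand in for
# document metadata that a real workflow would extract or be given.
DOCS = [
    {"topic": "cardiology", "year": 2017,
     "text": "heart failure protein kinase heart failure"},
    {"topic": "cardiology", "year": 2018,
     "text": "protein kinase and heart failure"},
    {"topic": "oncology", "year": 2017,
     "text": "tumor growth protein kinase"},
    {"topic": "oncology", "year": 2018,
     "text": "tumor growth and tumor suppressor"},
]


def bigrams(text):
    """Split text on whitespace and return adjacent word pairs as phrases."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def build_text_cube(docs, dimensions):
    """Group document texts into cells keyed by their metadata values."""
    cube = defaultdict(list)
    for doc in docs:
        cell = tuple(doc[d] for d in dimensions)
        cube[cell].append(doc["text"])
    return dict(cube)


def summarize_cell(cube, cell, top_k=1):
    """Rank phrases by how much more often they occur in `cell` than in
    the other cells. This simple ratio is an assumption for illustration,
    not a published scoring function."""
    in_cell = Counter(p for text in cube[cell] for p in bigrams(text))
    elsewhere = Counter(
        p
        for other, texts in cube.items() if other != cell
        for text in texts
        for p in bigrams(text)
    )
    scores = {p: n / (1 + elsewhere[p]) for p, n in in_cell.items()}
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:top_k]]


cube = build_text_cube(DOCS, ["topic"])
print(summarize_cell(cube, ("cardiology",)))
print(summarize_cell(cube, ("oncology",)))
```

<P> Passing <code>["topic", "year"]</code> as the dimensions instead produces one cell per topic and year combination. A phrase such as "protein kinase" appears in both topics, so it scores lower than phrases concentrated in a single cell, which is the intuition behind summarizing cells by distinctive phrases. </P>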

An example of a type of unstructured data is