TY  - JOUR
T1  - Multidimensional analysis model for a document warehouse that includes textual measures
JO  - Decision Support Systems
VL  - 72
IS  - 0
SP  - 44
EP  - 59
PY  - 2015/4//
T2  - 
AU  - Mendoza, Martha
AU  - Alegría, Erwin
AU  - Maca, Manuel
AU  - Cobos, Carlos
AU  - León, Elizabeth
SN  - 0167-9236
DO  - http://dx.doi.org/10.1016/j.dss.2015.02.008
UR  - http://www.sciencedirect.com/science/article/pii/S0167923615000287
KW  - Document warehouse
KW  - OLAP
KW  - Textual measures
KW  - Text warehouse
AB  - Data warehouses and On-Line Analytical Processing (OLAP) tools together permit multi-dimensional analysis of structured data. However, as business systems are increasingly required to handle substantial quantities of unstructured textual information, the need arises for a similarly effective means of analysis. To manage unstructured text data stored in data warehouses, a new multi-dimensional analysis model is proposed that includes textual measures as well as a topic hierarchy. In this model, the textual measures that associate the topics with the text documents are generated by Probabilistic Latent Semantic Analysis, while the hierarchy is created automatically using a clustering algorithm. Documents can then be queried using OLAP tools. The model was evaluated from two viewpoints: query execution time and user satisfaction. Execution time was evaluated on scientific articles using two query types, and user satisfaction (with query time and ease of use) was evaluated using statistical frequency and multivariate analyses. Encouraging observations included that, as the number of documents increases, query time increases linearly rather than exponentially. In addition, acceptance of the model increased with use, and the visualization of the model was also well received by users.
ER  - 