Interactive tool for visualization of topic models

Miroslav Smatana, Viktória Martínková, Dominika Maršáleková, Peter Butka

Issue: 2/2019
Journal: Acta Electrotechnica et Informatica
DOI: 10.15546/aeei-2019-0014

Keywords: topic modeling, visualization, data analysis, Latent Dirichlet Allocation

Abstract: Digital data are all around us and occur in various forms such as videos, pictures, or texts. Digital documents represent the vast majority of such data. They can be e-news, social media contributions, and so on. They can contain useful information, but because of their volume, finding information relevant to a particular company or person is time-consuming. For that reason, there is a need for their automatic analysis. One of the areas that deals with textual data analysis is topic modeling. It offers a new way to automatically browse, search, and summarize data in an organization. Topic modeling can be useful for time-based analysis of crises, elections, news feeds, the launch of new products on the market, and other tasks that lead to decision support. In this paper, we aim to survey and compare topic modeling methods and propose a web application that visualizes topics extracted using the topic modeling method called Latent Dirichlet Allocation (LDA). The selected standard topic modeling methods were compared experimentally on two textual datasets (20Newsgroup and Reuters) using a standard evaluation metric. The proposed web application uses LDA and can extract topic models from datasets of textual documents, visualize them, and show their evolution over time.