Scalable and Efficient Probabilistic Topic Model Inference for Textual Data

Author : Måns Magnusson
Publisher : Linköping University Electronic Press
Total Pages : 53
Release : 2018-04-27
ISBN-10 : 9176852881
ISBN-13 : 9789176852880

Book Synopsis Scalable and Efficient Probabilistic Topic Model Inference for Textual Data by : Måns Magnusson

Scalable and Efficient Probabilistic Topic Model Inference for Textual Data was written by Måns Magnusson and published by Linköping University Electronic Press. It was released on 2018-04-27 and has 53 pages.

Book excerpt: Probabilistic topic models have proven to be an extremely versatile class of mixed-membership models for discovering the thematic structure of text collections. There are many possible applications, covering a broad range of areas of study: technology, natural science, social science and the humanities. In this thesis, new efficient parallel Markov Chain Monte Carlo inference algorithms are proposed for Bayesian inference in large topic models. The proposed methods scale well with the corpus size and can be used for other probabilistic topic models and other natural language processing applications. The proposed methods are fast, efficient, scalable, and will converge to the true posterior distribution. In addition, a supervised topic model for high-dimensional text classification is proposed, with emphasis on interpretable document prediction using the horseshoe shrinkage prior. Finally, we develop a model and inference algorithm that can model agenda and framing of political speeches over time with a priori defined topics. We apply the approach to analyze the evolution of immigration discourse in the Swedish parliament by combining theory from political science and communication science with a probabilistic topic model.

Swedish abstract (in English translation): Probabilistic topic models are a versatile class of models for estimating the topic composition of large corpora. Applications exist in a number of scientific fields, such as technology, natural science, social science and the humanities. This thesis proposes new efficient and parallel Markov Chain Monte Carlo algorithms for Bayesian topic models. The proposed methods scale well with the size of the corpus and can be used for several different topic models and similar models in language technology. The proposed methods are fast, efficient, scalable, and converge to the true posterior distribution. In addition, a topic model for high-dimensional text classification is proposed, with emphasis on interpretable document classification through the use of strongly regularizing prior distributions. Finally, a topic model is developed for analyzing "agenda" and "framing" for a predetermined topic. With this method we analyze the immigration discourse in the Swedish parliament (Sveriges Riksdag) over time, by combining theory from political science, communication science and probabilistic topic models.


Scalable and Efficient Probabilistic Topic Model Inference for Textual Data Related Books

Scalable and Efficient Probabilistic Topic Model Inference for Textual Data
Language: en
Pages: 53
Authors: Måns Magnusson
Type: BOOK - Published: 2018-04-27 - Publisher: Linköping University Electronic Press

Probabilistic topic models have proven to be an extremely versatile class of mixed-membership models for discovering the thematic structure of text collections.
Machine Learning-Based Bug Handling in Large-Scale Software Development
Language: en
Pages: 120
Authors: Leif Jonsson
Type: BOOK - Published: 2018-05-17 - Publisher: Linköping University Electronic Press

This thesis investigates the possibilities of automating parts of the bug handling process in large-scale software development organizations. The bug handling p
Distributed Moving Base Driving Simulators
Language: en
Pages: 42
Authors: Anders Andersson
Type: BOOK - Published: 2019-04-30 - Publisher: Linköping University Electronic Press

Development of new functionality and smart systems for different types of vehicles is accelerating with the advent of new emerging technologies such as connecte
Robust Stream Reasoning Under Uncertainty
Language: en
Pages: 234
Authors: Daniel de Leng
Type: BOOK - Published: 2019-11-08 - Publisher: Linköping University Electronic Press

Vast amounts of data are continually being generated by a wide variety of data producers. This data ranges from quantitative sensor observations produced by rob