Topic detection framework
We provide a framework containing Java implementations of the following topic detection methods:
- Document-pivot topic detection with LSH indexing.
- Graph-based feature-pivot topic detection using the SCAN algorithm.
- Latent Dirichlet Allocation. This is a wrapper around the MALLET implementation.
- Soft frequent itemset mining.
- BNgram.
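To give a feel for the first method in the list above, the following is a minimal sketch of document-pivot clustering with an LSH index. It is not the framework's code: the class name DocumentPivotSketch, its parameters, and the choice of random-hyperplane hashing over raw term frequencies are all assumptions made for illustration, and the actual implementation may differ in every one of these respects. Each incoming document is hashed to a bucket, compared by cosine similarity only against documents in that bucket, and either joins the cluster of its best match or starts a new cluster.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only; not part of the framework's API.
class DocumentPivotSketch {
    private final int numBits;      // number of random hyperplanes (at most 31)
    private final double threshold; // minimum cosine similarity to join a cluster
    private final long seed;        // makes the hyperplanes reproducible
    private final Map<Integer, List<Integer>> buckets = new HashMap<>();
    private final List<Map<String, Double>> vectors = new ArrayList<>();
    private final List<Integer> clusterOf = new ArrayList<>();
    private int nextCluster = 0;

    DocumentPivotSketch(int numBits, double threshold, long seed) {
        this.numBits = numBits;
        this.threshold = threshold;
        this.seed = seed;
    }

    // Lowercased term-frequency vector; '#' and '@' are kept for hashtags and mentions.
    static Map<String, Double> toVector(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9#@]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1.0, Double::sum);
            }
        }
        return tf;
    }

    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            nb += v * v;
        }
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // Coordinate of hyperplane `plane` along the axis of `term`,
    // derived deterministically so no vocabulary has to be stored.
    private double hyperplaneWeight(int plane, String term) {
        long mixed = (long) term.hashCode() * 1_000_003L + plane;
        return new Random(seed ^ mixed).nextGaussian();
    }

    // One bit per hyperplane: which side of the plane the vector falls on.
    private int signature(Map<String, Double> vec) {
        int sig = 0;
        for (int p = 0; p < numBits; p++) {
            double dot = 0;
            for (Map.Entry<String, Double> e : vec.entrySet()) {
                dot += e.getValue() * hyperplaneWeight(p, e.getKey());
            }
            if (dot >= 0) {
                sig |= (1 << p);
            }
        }
        return sig;
    }

    // Adds a document and returns the id of the cluster it was assigned to.
    int add(String text) {
        Map<String, Double> vec = toVector(text);
        List<Integer> bucket =
                buckets.computeIfAbsent(signature(vec), k -> new ArrayList<>());
        int best = -1;
        double bestSim = threshold;
        for (int id : bucket) { // only same-bucket documents are candidates
            double sim = cosine(vec, vectors.get(id));
            if (sim >= bestSim) {
                bestSim = sim;
                best = id;
            }
        }
        int cluster = (best >= 0) ? clusterOf.get(best) : nextCluster++;
        bucket.add(vectors.size());
        vectors.add(vec);
        clusterOf.add(cluster);
        return cluster;
    }

    public static void main(String[] args) {
        DocumentPivotSketch sketch = new DocumentPivotSketch(8, 0.5, 42L);
        System.out.println(sketch.add("Earthquake hits city"));   // prints 0
        System.out.println(sketch.add("earthquake hits city"));   // prints 0
        System.out.println(sketch.add("football final tonight")); // prints 1
    }
}
```

A single hash table, as used here, can miss similar documents that fall on opposite sides of one hyperplane; practical LSH indexes usually keep several tables with independent hyperplanes to raise recall.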
The framework is particularly well suited to Twitter data: it can be applied directly to the JSON format of the Twitter Streaming API. For instance, it can be applied to the SocialSensor Twitter datasets.
To get started with this framework, first download and extract the zipped distribution, then read the detailed instructions in Guidelines.pdf.
This framework was used to conduct the experimental study in [1]. If you use it in your research, please cite [1]. For any questions or requests regarding this software, please contact any of the following co-authors of the paper: Giorgos, Carlos, Akis or Luca.
[1] L. M. Aiello, G. Petkos, C. Martin, D. Corney, S. Papadopoulos, R. Skraba, A. Goker, I. Kompatsiaris, A. Jaimes. Sensing trending topics in Twitter. IEEE Transactions on Multimedia (pre-print), 2013.