storage-indexing
The storage-indexing components of SocialSensor include the following:
- MongoDB: This stores the metadata of Items, MediaItems and WebPages, as well as auxiliary data such as the Twitter accounts to monitor and the URLs to fetch.
- Solr: This hosts the Items, once they have been populated with their metadata, and the DySCOs in two separate, fully searchable collections. A third collection stores MediaItems together with a set of properties, so that they can be retrieved with a full-text search query. Finally, the TopicDetectionItems collection supports the n-gram analysis involved in DySCO creation. This collection is not permanent storage but a temporary repository for the processed Items of each timeslot.
- mm-index: This index holds image features to enable fast and scalable similarity-based search. The underlying indexing mechanism is documented in D4.2, and thorough evaluation results are available in D4.3. Its source code is available in the multimedia-indexing GitHub project.
- infotainmentDB: This database is dedicated to the infotainment use case and stores the schema and content around an event of interest (e.g. for a film festival: film program, film details, directors, etc.).
These components can be accessed through methods of the socialsensor-framework-client as well as through a REST API (e.g. the infotainment API).
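
As an illustration of the full-text search that the Solr collections support, the sketch below builds a query URL for Solr's standard `/select` handler. It does not use the socialsensor-framework-client or the project's REST API. The host, the collection name `MediaItems` and the search text are assumptions made for this example, and the function `build_solr_select_url` is a hypothetical helper, not part of SocialSensor.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, collection, text, rows=10):
    """Return a Solr /select URL for a full-text query on one collection."""
    params = {
        "q": text,     # query string, interpreted by Solr's query parser
        "rows": rows,  # maximum number of documents to return
        "wt": "json",  # request a JSON response
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection name, for illustration only.
url = build_solr_select_url("http://localhost:8983/solr", "MediaItems", "film festival")
print(url)
```

The sketch only constructs the URL and sends no request. Against a running Solr instance, the URL could be fetched with any HTTP client, and the actual collection and field names would have to be taken from the deployed schema.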