Interactive Faceted Search/Browse

Interactive Faceted Search/Browse component

This component integrates the local database previously described and the Social Network Visualization component. It provides a search interface with faceted filtering, metadata parsing, and preview and navigation of results.

  • Creates a fully customizable HTML component that can be included in an integrated service built on top of the agINFRA infrastructure.
  • Provides different types of filtering and browsing capabilities, depending on the API calls made by the application.

Usage and deployment (if publicly accessible)

The latest (second) version of the Interactive Faceted Search&Browse component was developed by ESPOL and deployed on the agINFRA cloud. The component exposes a REST API that developers can use to embed a customizable search component. ESPOL also hosts a mirror of the agINFRA cloud deployment as a backup. Metadata from the RING, such as the Organic.Edunet, GRNET, TRAGLOR and JMA collections, along with other datasets available to agINFRA-powered components, has been imported; the deployment runs in a virtual machine on the INFN CLEVER cloud.

The component is deployed and functional at the following URL: http://212.189.144.208/aginfra/search/


Example Usage Scenario

This component provides a flexible way to explore large datasets. It helps users of any kind refine their searches to find very specific content in very large repositories or combined datasets.


Before agINFRA

agINFRA powered version

In the agINFRA-powered version, the Generator Tool will instantiate the component from user-supplied parameters to produce a personalized faceted search tool. This will cover the implementation of the CreateNewPersonalizedFS&B interface, which will make the personalized tool available at a URL under a provided ID. All of this will be managed by a simple MySQL database deployed on the same infrastructure as the component.

APIs

The FS&B component is powered by the Elasticsearch engine through the CouchDB river (https://github.com/elasticsearch/elasticsearch-river-couchdb/blob/master/README.md), which tracks changes in the database and makes the documents available to the Elasticsearch index. The core of the application uses the Backbone.js library. The Elasticsearch index is defined from the Linux shell as follows, assuming Elasticsearch is running on localhost, port 9200:

    curl -XPUT 'http://localhost:9200/aginfra_ds/'
    curl -XPUT 'http://localhost:9200/_river/aginfra_ds/_meta' -d '{
        "type" : "couchdb",
        "couchdb" : {
            "host" : "localhost",
            "port" : 5984,
            "db" : "aginfra_datasets",
            "filter" : null
        },
        "index" : {
            "index" : "aginfra_ds",
            "type" : "aginfra_ds",
            "bulk_size" : "100",
            "bulk_timeout" : "10ms"
        }
    }'
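
When the index or database name differs between deployments, it can help to generate the two requests above rather than edit them by hand. The following Python sketch only builds the method, URL and body of each call and sends nothing. The function name `river_requests` and its parameters are illustrative assumptions and are not part of the component.

```python
import json

ES_URL = "http://localhost:9200"  # same host and port as the curl commands above


def river_requests(index="aginfra_ds", couch_db="aginfra_datasets",
                   couch_host="localhost", couch_port=5984):
    """Return (method, url, body) tuples mirroring the two curl calls.

    Hypothetical helper: the defaults reproduce the values shown above.
    """
    meta = {
        "type": "couchdb",
        "couchdb": {
            "host": couch_host,
            "port": couch_port,
            "db": couch_db,
            "filter": None,  # serialized as JSON null
        },
        "index": {
            "index": index,
            "type": index,  # the page uses the same name for index and type
            "bulk_size": "100",
            "bulk_timeout": "10ms",
        },
    }
    return [
        ("PUT", f"{ES_URL}/{index}/", None),
        ("PUT", f"{ES_URL}/_river/{index}/_meta", json.dumps(meta)),
    ]


if __name__ == "__main__":
    for method, url, body in river_requests():
        print(method, url)
        if body is not None:
            print(body)
```

Each tuple can then be passed to any HTTP client, or compared against the curl commands to check a deployment script.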

The component can work with the following facets: Context, Language, Dataset and Format. The component API is specified below:


Interface name: CreateNewPersonalizedFS&B
Functionality: Create a new personalized FS&B component. Returns the identifiers needed to use the component.
API access type: REST
Return type: JSON
URL and parameters: (To be developed)

Interface name: GetAvailableDatasets (deprecated)
Functionality: Get a list of available datasets.
API access type: Elasticsearch Search API/POST
Return type: JSON
Note: Datasets are now exposed as facets, so the GetAvailableFacets interface should be used to get all available datasets.

Interface name: GetAvailableFacets
Functionality: Get a list of available facet IDs according to the repository.
API access type: Elasticsearch Search API/POST
Return type: JSON
URL and parameters: http://212.189.144.208/aginfra/search/endpoint.php

  • query: JSON data for querying the Elasticsearch index. Example that returns all facets and the first 10 results:
    {"size":10,"from":0,"query":{"filtered":{"query":{"query_string":{"fields":["aginfra_eu.lom_general_title_string_type.value","aginfra_eu.lom_lifecycle_contribute_entity_type","aginfra_eu.lom_general_description_string_type"],"query":"*","default_operator":"OR"}},"filter":{"match_all":{}}}},"facets":{"aginfra_eu.lom_general_language_type.value":{"terms":{"field":"aginfra_eu.lom_general_language_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.lom_educational_context_value_type.value":{"terms":{"field":"aginfra_eu.lom_educational_context_value_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.lom_technical_format_type.value":{"terms":{"field":"aginfra_eu.lom_technical_format_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.dataset.value":{"terms":{"field":"aginfra_eu.dataset.value","size":20},"facet_filter":{"match_all":{}}}}}
  • idx: Elasticsearch index defined in the CouchDB river (e.g. aginfra_ds).
  • type: Elasticsearch index type defined in the CouchDB river (e.g. aginfra_ds).
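
The example query is easier to maintain when it is generated from the list of facet fields. The Python sketch below rebuilds the same JSON structure and wraps it in a POST request object without sending it. The field names and endpoint URL come from this page; the helper names are hypothetical, and the form-encoded request body is an assumption, since the page does not state how endpoint.php expects the `query`, `idx` and `type` parameters to be encoded.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://212.189.144.208/aginfra/search/endpoint.php"

# Fields searched by the query_string clause in the example above.
SEARCH_FIELDS = [
    "aginfra_eu.lom_general_title_string_type.value",
    "aginfra_eu.lom_lifecycle_contribute_entity_type",
    "aginfra_eu.lom_general_description_string_type",
]

# Language, Context, Format and Dataset facets, as named in the example.
FACET_FIELDS = [
    "aginfra_eu.lom_general_language_type.value",
    "aginfra_eu.lom_educational_context_value_type.value",
    "aginfra_eu.lom_technical_format_type.value",
    "aginfra_eu.dataset.value",
]


def all_facets_query(text="*", size=10, start=0, facet_size=20):
    """Build the 'all facets and first results' query as a dict."""
    facets = {
        field: {
            "terms": {"field": field, "size": facet_size},
            "facet_filter": {"match_all": {}},
        }
        for field in FACET_FIELDS
    }
    return {
        "size": size,
        "from": start,
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "fields": SEARCH_FIELDS,
                        "query": text,
                        "default_operator": "OR",
                    }
                },
                "filter": {"match_all": {}},
            }
        },
        "facets": facets,
    }


def build_request(query, idx="aginfra_ds", doc_type="aginfra_ds"):
    """Wrap the query in a POST request; nothing is sent here.

    ASSUMPTION: the three parameters travel as form-encoded fields.
    """
    body = urlencode({"query": json.dumps(query), "idx": idx, "type": doc_type})
    return Request(ENDPOINT, data=body.encode("utf-8"), method="POST")


if __name__ == "__main__":
    request = build_request(all_facets_query())
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it, provided the encoding assumption holds for the deployed endpoint.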


Interface name: GetFacets
Functionality: Get facets according to the repository and query.
API access type: Elasticsearch Search API/POST
Return type: JSON
URL and parameters: http://212.189.144.208/aginfra/search/endpoint.php

  • query: JSON data for querying the Elasticsearch index. Example that filters on the facet “format” with a value of “pdf” and returns the first 10 results:
    {"size":10,"from":0,"query":{"filtered":{"query":{"query_string":{"fields":["aginfra_eu.lom_general_title_string_type.value","aginfra_eu.lom_lifecycle_contribute_entity_type","aginfra_eu.lom_general_description_string_type"],"query":"*","default_operator":"OR"}},"filter":{"and":[{"term":{"aginfra_eu.lom_technical_format_type.value":"pdf"}}]}}},"facets":{"aginfra_eu.lom_general_language_type.value":{"terms":{"field":"aginfra_eu.lom_general_language_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.lom_educational_context_value_type.value":{"terms":{"field":"aginfra_eu.lom_educational_context_value_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.lom_technical_format_type.value":{"terms":{"field":"aginfra_eu.lom_technical_format_type.value","size":20},"facet_filter":{"match_all":{}}},"aginfra_eu.dataset.value":{"terms":{"field":"aginfra_eu.dataset.value","size":20},"facet_filter":{"match_all":{}}}}}
  • idx: Elasticsearch index defined in the CouchDB river (e.g. aginfra_ds).
  • type: Elasticsearch index type defined in the CouchDB river (e.g. aginfra_ds).
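
In the GetFacets example, the only part of the query that changes with the user's selection is the `filter` clause. The Python sketch below shows one way to derive that clause from the selected facet values and to read term counts back from the returned JSON. Both function names are hypothetical, and `facet_counts` assumes that endpoint.php passes the Elasticsearch response through unchanged, with each facet carrying a `terms` list of `term`/`count` entries; the page does not confirm this.

```python
def facet_filter(selected):
    """Turn {field: value} selections into the 'filter' clause.

    With no selection this falls back to match_all, which matches
    every document and leaves the search unfiltered.
    """
    if not selected:
        return {"match_all": {}}
    return {"and": [{"term": {field: value}} for field, value in selected.items()]}


def facet_counts(response):
    """Map each facet name to a {term: count} dict.

    ASSUMPTION: 'response' is the parsed Elasticsearch reply, with
    facets shaped as {"terms": [{"term": ..., "count": ...}, ...]}.
    """
    return {
        name: {entry["term"]: entry["count"] for entry in facet.get("terms", [])}
        for name, facet in response.get("facets", {}).items()
    }


if __name__ == "__main__":
    # Same selection as the example: format = pdf.
    print(facet_filter({"aginfra_eu.lom_technical_format_type.value": "pdf"}))
```

The dict returned by `facet_filter` can replace the value of `query.filtered.filter` in the query, while the `facets` section stays the same.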
