Class GeoIPIndexingFilter

  • All Implemented Interfaces:
    Configurable, IndexingFilter, Pluggable

    public class GeoIPIndexingFilter
    extends Object
    implements IndexingFilter
    This plugin implements an indexing filter which takes advantage of the GeoIP2-java API.

    The third-party library distribution provides an API for the GeoIP2 Precision web services and databases. The API also works with the free GeoLite2 databases.

    Depending on the service level agreement you have with the GeoIP service provider, the plugin can add some of the following fields to the index data model:

    1. Continent
    2. Country
    3. Regional Subdivision
    4. City
    5. Postal Code
    6. Latitude/Longitude
    7. ISP/Organization
    8. AS Number
    9. Confidence Factors
    10. Radius
    11. User Type

    Some of these services are documented on the GeoIP2 Precision Services webpage, where more information can be obtained.

    You should also consult the following three properties in nutch-site.xml:

      
     <!-- index-geoip plugin properties -->
     <property>
       <name>index.geoip.usage</name>
       <value>insightsService</value>
       <description>
       A string naming the information source to be used for GeoIP information
       association. Enter one of 'cityDatabase', 'connectionTypeDatabase',
       'domainDatabase', 'ispDatabase' or 'insightsService'. If you use one of the
       database options, make the corresponding file (GeoIP2-City.mmdb,
       GeoIP2-Connection-Type.mmdb, GeoIP2-Domain.mmdb or GeoIP2-ISP.mmdb, respectively)
       available on the Hadoop classpath at runtime. This can be achieved by adding
       the file to `$NUTCH_HOME/conf`.
       Alternatively, the GeoLite2 databases (GeoLite2-*.mmdb) can be used.
       </description>
     </property>
     
     <property>
       <name>index.geoip.userid</name>
       <value></value>
       <description>
       The userId associated with the GeoIP2 Precision Services account.
       </description>
     </property>
     
     <property>
       <name>index.geoip.licensekey</name>
       <value></value>
       <description>
       The license key associated with the GeoIP2 Precision Services account.
       </description>
     </property>
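
    The relationship between index.geoip.usage and the files or credentials it needs can be summarised in a small sketch. This is not the plugin's own code: the class GeoIpUsageSketch and its methods are hypothetical names introduced for illustration, and the rule that only 'insightsService' needs index.geoip.userid and index.geoip.licensekey is an assumption inferred from the property descriptions above.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, NOT part of the Nutch index-geoip plugin. It only
// mirrors the mapping stated in the index.geoip.usage description.
final class GeoIpUsageSketch {

  // Usage values that are backed by a local .mmdb database file.
  private static final Map<String, String> DATABASE_FILES = Map.of(
      "cityDatabase", "GeoIP2-City.mmdb",
      "connectionTypeDatabase", "GeoIP2-Connection-Type.mmdb",
      "domainDatabase", "GeoIP2-Domain.mmdb",
      "ispDatabase", "GeoIP2-ISP.mmdb");

  private GeoIpUsageSketch() {}

  // Returns the database file expected on the classpath for a usage value,
  // or an empty Optional for 'insightsService', which uses the web service.
  static Optional<String> databaseFileFor(String usage) {
    if ("insightsService".equals(usage)) {
      return Optional.empty();
    }
    String file = (usage == null) ? null : DATABASE_FILES.get(usage);
    if (file == null) {
      throw new IllegalArgumentException("Unknown index.geoip.usage value: " + usage);
    }
    return Optional.of(file);
  }

  // Assumption: only the web service option needs the user id and license key.
  static boolean needsCredentials(String usage) {
    return "insightsService".equals(usage);
  }

  public static void main(String[] args) {
    String usage = args.length > 0 ? args[0] : "cityDatabase";
    System.out.println(usage + " -> " + databaseFileFor(usage).orElse("(web service, no local file)"));
  }
}
```

    Running the sketch with a usage value as its only argument prints the file that would have to be placed in `$NUTCH_HOME/conf`. A GeoLite2 file would be substituted by hand, since the sketch does not model that alternative.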