Overview
It is estimated that there are over 370 million pornographic pages on the internet, with thousands more being created every day. The dynamic nature of the internet is a major challenge to web filtering companies endeavouring to maintain an up-to-date URL category database.
As human categorisation teams struggle to keep up with the ever-changing landscape of pornographic websites, web filtering companies are employing a blend of filtering technologies to classify the internet accurately. Such heuristic filtering technologies are often based purely on lexical analysis, which can be ineffective on foreign-language sites and on pornographic sites with little or no text content. Pornographic websites are by nature primarily image-based, and whereas language is regionally specific, an image is global.
Hence, when deploying technology to categorise these sites accurately, it is essential that the layered system analyses the image content of a site and incorporates the results of that analysis as a key element of the decision-making process.
Image Analyzer™ WebCategorizer has been specifically developed for filtering vendors as a ‘back office’ categorisation system, reviewing existing lists of categorised sites and analysing previously unseen sites.
How does it work?
The Image Analyzer™ WebCategorizer analyses batches of URLs to determine whether they are pornographic in nature. The application analyses both the images and the lexical content on each page, including any associated hyperlinks, to produce a probability result.
The technology has been developed to process data very quickly and can be implemented as an additional component of a ‘back office’ filtering/prioritisation system. Image Analyzer™ WebCategorizer will greatly improve the productivity of your human categorisation team and the accuracy of your existing heuristics-based categorisation system.
The Image Analyzer™ WebCategorizer scans all the images on any given URL and categorises the site based on the total number of images deemed to be pornographic. This helps to eliminate false positives and false negatives by looking at the trend across all the images on all the pages of the site.
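The idea of categorising a site by the trend across all of its images, rather than by any single image, can be illustrated with a minimal Python sketch. It is not the product's actual algorithm: the PageResult type, the per-image scores and both threshold values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical scan result for one page of a site."""
    url: str
    image_scores: list  # assumed per-image scores in the range 0.0 to 1.0


def flagged_fraction(pages, image_threshold=0.8):
    """Return the fraction of all images, across all pages, at or above the threshold."""
    scores = [score for page in pages for score in page.image_scores]
    if not scores:
        return 0.0  # a site with no images contributes no image evidence
    flagged = sum(1 for score in scores if score >= image_threshold)
    return flagged / len(scores)


def categorise_site(pages, image_threshold=0.8, trigger_fraction=0.5):
    """Flag the site only when enough of its images are flagged overall."""
    return flagged_fraction(pages, image_threshold) >= trigger_fraction
```

Under these assumed thresholds, one high-scoring image among many low-scoring ones does not flag the site, while a consistent pattern across pages does.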
The product automatically follows links from thumbnails to the larger, higher-resolution source images, as well as hyperlinks to sub-pages.
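Link following of this kind is commonly built around a queue of pages still to visit. The Python sketch below shows one such approach, a breadth-first traversal, under stated assumptions: the get_links callback and the SAMPLE_LINKS graph are hypothetical stand-ins for real page fetching, and nothing here describes how the product itself is implemented.

```python
from collections import deque

# Hypothetical link graph standing in for real fetched pages.
SAMPLE_LINKS = {
    "site/index": ["site/gallery", "site/about"],
    "site/gallery": ["site/index", "site/gallery/full-1"],
    "site/about": [],
    "site/gallery/full-1": [],
}


def crawl_order(start_url, get_links, max_pages=100):
    """Visit pages breadth-first, queuing each newly discovered link once."""
    queue = deque([start_url])
    seen = {start_url}
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:  # avoid re-scanning pages already queued
                seen.add(link)
                queue.append(link)
    return order
```

Calling crawl_order("site/index", lambda url: SAMPLE_LINKS.get(url, [])) visits each page in the sample graph exactly once, and the max_pages cap keeps a large site from being followed indefinitely.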
Features & Benefits:
- Utilises Image Analyzer’s revolutionary pornographic image scanner
- Scans image and text content of the web page
- Categorisation based on all page content
- Processes batch files of multiple URLs
- Adjustable image scanning thresholds and engine settings
- Editable weighted lexical word list
- Definable trigger values for % of images on page and lexical score
- Collects all hyperlinks associated with the web page
- Automatically scans hyperlinks and adds new sites to scanning queue
- Adjustable image size constraints
- Detailed XML output including image results, words found, site IP and hyperlinks
- Multi-threaded URL fetching and scanning
- Definable method of evaluation, processing conditions and results evaluation
- Optional Video Analyzer Module
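
As a rough illustration of how a weighted word list and definable trigger values might combine, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the placeholder words and weights, the default trigger values, and the choice to flag a page when either trigger is met. The product's own evaluation method is configurable and is not documented here.

```python
import re

# Hypothetical weighted word list; real lists would be edited by the operator.
WORD_WEIGHTS = {"alpha": 3.0, "beta": 1.5}


def lexical_score(text, weights):
    """Sum the weights of every listed word that occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(weights.get(word, 0.0) for word in words)


def page_triggers(image_percent, lex_score,
                  image_trigger=50.0, lexical_trigger=5.0):
    """Assumed rule: flag the page when either trigger value is reached."""
    return image_percent >= image_trigger or lex_score >= lexical_trigger


# Example: a page with few flagged images but a high lexical score.
score = lexical_score("Alpha beta alpha gamma", WORD_WEIGHTS)
flagged = page_triggers(image_percent=10.0, lex_score=score)
```

A stricter configuration could require both triggers to be met instead, trading fewer false positives for more missed pages.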