
HarvestMan::rules::harvestManRulesChecker Class Reference


Detailed Description

Class that checks the download rules for URLs. These
rules include depth checks, robots.txt checks, filter
checks, external server and directory checks, duplicate
URL checks, and maximum-limit checks.

Definition at line 26 of file rules.py.
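
The kinds of checks listed above can be illustrated with a small, self-contained sketch. This is not the HarvestMan implementation: the class name SimpleRulesChecker, its constructor parameters, the violates method, and the returned rule names are all hypothetical, and the code targets Python 3 using only the standard library. It assumes depth is counted in directory levels below the starting directory and that the robots.txt rules are supplied as a list of lines rather than fetched over the network.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class SimpleRulesChecker:
    """Hypothetical, simplified stand-in; not the HarvestMan class."""

    def __init__(self, start_url, max_depth=2, robots_lines=()):
        parts = urlsplit(start_url)
        self.start_host = parts.hostname
        # Directory of the starting URL, e.g. "/docs/" for "/docs/index.html".
        self.start_dir = parts.path.rsplit("/", 1)[0] + "/"
        self.max_depth = max_depth
        self.seen = set()  # URLs already accepted (duplicate check)
        # Parse robots.txt rules from memory; no network access is made.
        self.robots = RobotFileParser()
        self.robots.parse(list(robots_lines))

    def violates(self, url):
        """Return the name of the first rule the URL breaks, or None."""
        parts = urlsplit(url)
        if url in self.seen:
            return "duplicate"
        if parts.hostname != self.start_host:
            return "external-server"
        if not parts.path.startswith(self.start_dir):
            return "external-directory"
        # Assumed depth measure: directory levels below the starting directory.
        relative = parts.path[len(self.start_dir):]
        if relative.count("/") > self.max_depth:
            return "depth"
        if not self.robots.can_fetch("*", url):
            return "robots"
        self.seen.add(url)
        return None


checker = SimpleRulesChecker(
    "http://example.com/docs/index.html",
    max_depth=1,
    robots_lines=["User-agent: *", "Disallow: /docs/private/"],
)
for candidate in (
    "http://example.com/docs/a.html",
    "http://example.com/docs/a.html",
    "http://other.org/docs/a.html",
    "http://example.com/docs/x/y/z.html",
):
    print(candidate, "->", checker.violates(candidate))
```

In this sketch a URL is recorded as seen only after it passes every check, so a rejected URL is reported with the same reason each time it is offered. The order of the checks is a design choice made for the example, not something the reference above specifies.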


Public Member Functions

def __init__
def add_link
def add_source_link
def add_to_filter
def apply_word_filter
def clean_up
def dump_urls
def get_stats
def is_duplicate_link
def is_external_server_link
def is_under_starting_directory
def make_filters
def violates_basic_rules

Public Attributes

 junkfilter

Private Member Functions

def __apply_depth_check
def __apply_rep
def __apply_server_filter
def __apply_url_filter
def __compare_by_ip
def __compare_by_name
def __compare_domains
def __create_filter
def __ext_directory_check
def __ext_server_check
def __get_base_server
def __increment_ext_directory_count
def __increment_ext_server_count
def __is_external_link
def __is_inbetween
def __make_filter
def __make_not_expr
def __make_priority
def __make_word_filter
def __make_word_regexp
def __parse_word_filter

Private Attributes

 _configobj
 _extdirs
 _extservers
 _filter
 _links
 _rexplist
 _robocache
 _robots
 _sourceurls
 _wordstr

The documentation for this class was generated from the following file:

rules.py

Generated by Doxygen 1.6.0