public abstract class CommonCrawler extends Crawler
Constructor and Description |
---|
CommonCrawler() |
Modifier and Type | Method and Description |
---|---|
Fetcher | createFetcher(): Creates the Fetcher. Override this method to supply a custom Fetcher. |
Parser | createParser(String url, String contentType): Creates a Parser based on the page's url and contentType. Override this method to supply a custom Parser. |
Request | createRequest(String url): Creates a Request (HTTP request) from the url. Override this method to supply a custom Request. |
ConnectionConfig | getConconfig(): Returns the HTTP connection configuration object. |
String | getCookie(): Returns the cookie. |
boolean | getIsContentStored(): Returns whether the content of fetched pages/files is stored. |
Proxy | getProxy(): Returns the proxy. |
String | getUseragent(): Returns the User-Agent. |
void | setConconfig(ConnectionConfig conconfig): Sets the HTTP connection configuration object. |
void | setCookie(String cookie): Sets the cookie for HTTP requests. |
void | setIsContentStored(boolean isContentStored): Sets whether the content of fetched pages/files is stored. |
void | setProxy(Proxy proxy): Sets the proxy. |
void | setUseragent(String useragent): Sets the User-Agent. |
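
The accessors listed above follow a plain getter/setter pattern. The sketch below illustrates that pattern with a stand-in class, because CommonCrawler is abstract and its library is not assumed to be on the classpath. `ConfigSketch`, `CrawlerSettings`, the default values, the placeholder cookie and proxy address, and the use of `java.net.Proxy` as the Proxy type are all assumptions made for illustration.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

class ConfigSketch {
    // Hypothetical stand-in that mirrors the accessor names in the table.
    // It is not the library class, and the default values are assumptions.
    static class CrawlerSettings {
        private String useragent = "sketch-agent/0.1";
        private String cookie;
        private Proxy proxy;              // assumed here to be java.net.Proxy
        private boolean isContentStored;

        public String getUseragent() { return useragent; }
        public void setUseragent(String useragent) { this.useragent = useragent; }
        public String getCookie() { return cookie; }
        public void setCookie(String cookie) { this.cookie = cookie; }
        public Proxy getProxy() { return proxy; }
        public void setProxy(Proxy proxy) { this.proxy = proxy; }
        public boolean getIsContentStored() { return isContentStored; }
        public void setIsContentStored(boolean isContentStored) {
            this.isContentStored = isContentStored;
        }
    }

    // Applies one example configuration; every value is a placeholder.
    static CrawlerSettings configure() {
        CrawlerSettings s = new CrawlerSettings();
        s.setUseragent("Mozilla/5.0 (compatible; ExampleBot/1.0)");
        s.setCookie("sessionid=abc123");
        // createUnresolved avoids a DNS lookup while building the address.
        s.setProxy(new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved("127.0.0.1", 8080)));
        s.setIsContentStored(true);
        return s;
    }

    public static void main(String[] args) {
        CrawlerSettings s = configure();
        System.out.println(s.getUseragent());
        System.out.println(s.getCookie());
        System.out.println(s.getProxy().type());
        System.out.println(s.getIsContentStored());
    }
}
```

With the real library, the same setter calls would presumably be made on a concrete CommonCrawler subclass before the crawl is started.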
Methods inherited from class Crawler: addRegex, addSeed, createFetcherHandler, createGenerator, createInjector, failed, getRegexs, getSeeds, getThreads, inject, isResumable, setRegexs, setResumable, setSeeds, setThreads, start, stop, updateFetcher, visit
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
createDbUpdater
public Request createRequest(String url) throws Exception
- Parameters: url
- Throws: Exception
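
Overriding createRequest lets a subclass adjust every request before it is sent. The following is a minimal sketch of that idea, not the library's implementation: `Request`, its `url` and `headers` fields, `BaseCrawler`, and `MyCrawler` are hypothetical stand-ins, and only the method signature is taken from the listing above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CreateRequestSketch {
    // Hypothetical stand-in for the library's Request type.
    static class Request {
        final String url;
        final Map<String, String> headers = new LinkedHashMap<>();
        Request(String url) { this.url = url; }
    }

    // Hypothetical stand-in for the crawler; only the factory method is modelled.
    static class BaseCrawler {
        private final String useragent = "sketch-agent/0.1";

        public Request createRequest(String url) throws Exception {
            Request request = new Request(url);
            request.headers.put("User-Agent", useragent);
            return request;
        }
    }

    // The subclass reuses the default request and adds one more header.
    static class MyCrawler extends BaseCrawler {
        @Override
        public Request createRequest(String url) throws Exception {
            Request request = super.createRequest(url);
            request.headers.put("Accept-Language", "en");
            return request;
        }
    }

    public static void main(String[] args) throws Exception {
        Request request = new MyCrawler().createRequest("http://example.com/");
        System.out.println(request.url + " " + request.headers);
    }
}
```

Calling `super.createRequest(url)` first keeps whatever defaults the base class applies, so the override only adds to them.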
public Parser createParser(String url, String contentType) throws Exception
- Parameters: url, contentType
- Throws: Exception
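
Because createParser receives both the url and the contentType, an override can choose a parser per response. The sketch below shows one way this might look, assuming a dispatch on the Content-Type value; `Parser`, `HtmlParser`, `RawParser`, and `MyCrawler` are hypothetical stand-ins and are not types from the library.

```java
import java.util.Locale;

class CreateParserSketch {
    // Hypothetical stand-in for the library's Parser type.
    interface Parser {
        String name();
    }

    static class HtmlParser implements Parser {
        public String name() { return "html"; }
    }

    static class RawParser implements Parser {
        public String name() { return "raw"; }
    }

    // Hypothetical crawler that models only the createParser factory method.
    static class MyCrawler {
        public Parser createParser(String url, String contentType) throws Exception {
            // A Content-Type value may carry parameters such as "; charset=utf-8",
            // so only the leading media type is compared.
            if (contentType != null
                    && contentType.trim().toLowerCase(Locale.ROOT).startsWith("text/html")) {
                return new HtmlParser();
            }
            // Anything else, including a missing Content-Type, is treated as raw data.
            return new RawParser();
        }
    }

    public static void main(String[] args) throws Exception {
        MyCrawler crawler = new MyCrawler();
        System.out.println(crawler.createParser("http://example.com/", "text/html; charset=utf-8").name());
        System.out.println(crawler.createParser("http://example.com/a.pdf", "application/pdf").name());
    }
}
```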
public Fetcher createFetcher()
- createFetcher in class Crawler
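
createFetcher is a factory method: the crawler asks it for a Fetcher, so a subclass that overrides it changes which Fetcher is used without touching the rest of the crawl logic. The sketch below illustrates that mechanism only. `Fetcher`, its `retries` field and `name()` method, `BaseCrawler`, and `run()` are hypothetical and do not describe the library's actual Fetcher API or crawl loop.

```java
class CreateFetcherSketch {
    // Hypothetical stand-in for the library's Fetcher type.
    static class Fetcher {
        int retries = 1;                      // assumed setting, for illustration
        String name() { return "default"; }
    }

    // Hypothetical crawler: run() stands in for the crawl loop and obtains
    // its Fetcher only through the createFetcher() factory method.
    static class BaseCrawler {
        public Fetcher createFetcher() {
            return new Fetcher();
        }

        public String run() {
            Fetcher fetcher = createFetcher();
            return fetcher.name() + " x" + fetcher.retries;
        }
    }

    // Overriding the factory method swaps in a customised Fetcher.
    static class MyCrawler extends BaseCrawler {
        @Override
        public Fetcher createFetcher() {
            Fetcher fetcher = new Fetcher() {
                @Override
                String name() { return "custom"; }
            };
            fetcher.retries = 3;
            return fetcher;
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseCrawler().run());
        System.out.println(new MyCrawler().run());
    }
}
```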
public String getUseragent()

public void setUseragent(String useragent)
- Parameters: useragent

public ConnectionConfig getConconfig()

public void setConconfig(ConnectionConfig conconfig)
- Parameters: conconfig - the HTTP connection configuration object

public boolean getIsContentStored()

public void setIsContentStored(boolean isContentStored)
- Parameters: isContentStored - whether to store the content of fetched pages/files

public Proxy getProxy()

public void setProxy(Proxy proxy)
- Parameters: proxy - the proxy

public String getCookie()

public void setCookie(String cookie)
- Parameters: cookie - the cookie

Copyright © 2014. All Rights Reserved.