This chapter covers the Protocol abstraction:
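package com.digitalpebble.storm.crawler.fetcher;

import com.digitalpebble.storm.crawler.util.Configuration;

public interface Protocol {
    /** Fetches the given URL and returns its content, status code and metadata. */
    public ProtocolResponse getProtocolOutput(String url) throws Exception;

    /** Passes the crawler configuration to the implementation. */
    public void configure(Configuration conf);
}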
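To make the contract of the interface above concrete, here is a minimal sketch of one possible implementation that serves canned pages from memory instead of the network. It is not part of the original project: `InMemoryProtocol`, `addPage`, the simplified `Configuration` stand-in and the `http.agent.name` key are all assumptions made so that the example compiles on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for com.digitalpebble.storm.crawler.util.Configuration.
class Configuration {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
        values.put(key, value);
    }

    String get(String key, String defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

// Same shape as the ProtocolResponse class shown in this article.
class ProtocolResponse {
    private final byte[] content;
    private final int statusCode;
    private final HashMap<String, String[]> metadata;

    ProtocolResponse(byte[] c, int s, HashMap<String, String[]> md) {
        content = c;
        statusCode = s;
        metadata = md;
    }

    byte[] getContent() { return content; }
    int getStatusCode() { return statusCode; }
    HashMap<String, String[]> getMetadata() { return metadata; }
}

// Same two methods as the Protocol interface above.
interface Protocol {
    ProtocolResponse getProtocolOutput(String url) throws Exception;
    void configure(Configuration conf);
}

// Hypothetical implementation: answers from an in-memory map, never touches the network.
class InMemoryProtocol implements Protocol {
    private final Map<String, String> pages = new HashMap<>();
    private String agentName = "unknown";

    void addPage(String url, String body) {
        pages.put(url, body);
    }

    @Override
    public void configure(Configuration conf) {
        // "http.agent.name" is an assumed key, used here only for illustration.
        agentName = conf.get("http.agent.name", "unknown");
    }

    @Override
    public ProtocolResponse getProtocolOutput(String url) {
        HashMap<String, String[]> md = new HashMap<>();
        md.put("user-agent", new String[] { agentName });
        String body = pages.get(url);
        if (body == null) {
            // Unknown URL: empty body with a 404 status.
            return new ProtocolResponse(new byte[0], 404, md);
        }
        return new ProtocolResponse(body.getBytes(StandardCharsets.UTF_8), 200, md);
    }
}
```

A stub like this can be handy for testing the rest of a crawler topology without network access, since callers only depend on the two interface methods.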
The ProtocolFactory wrapper:
package com.digitalpebble.storm.crawler.fetcher;

import java.net.URL;
import java.util.WeakHashMap;

import com.digitalpebble.storm.crawler.fetcher.asynchttpclient.AHProtocol;
import com.digitalpebble.storm.crawler.util.Configuration;

/**
 * @author Yin Shuai
 */
public class ProtocolFactory {

    private final Configuration config;

    private final WeakHashMap<String, Protocol> cache = new WeakHashMap<String, Protocol>();

    public ProtocolFactory(Configuration conf) {
        config = conf;
    }

    /** Returns an instance of the protocol to use for a given URL **/
    public synchronized Protocol getProtocol(URL url) {
        // get the protocol
        String protocol = url.getProtocol();
        Protocol pp = cache.get(protocol);
        if (pp != null)
            return pp;

        // yuk! hardcoded for now
        pp = new AHProtocol();
        pp.configure(config);
        cache.put(protocol, pp);
        return pp;
    }
}
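The factory above always creates an `AHProtocol`, whatever the URL scheme (the "hardcoded for now" comment). One way this could be generalised is a registry of suppliers keyed by scheme. The following is only a sketch under stated assumptions: `SchemeProtocolFactory`, `register`, `StubProtocol` and the `String` overload of `getProtocol` are hypothetical names, `Configuration` is an empty stand-in, and `Protocol` is trimmed to `configure` so the example stays short.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical empty stand-in for the crawler's Configuration class.
class Configuration {
}

// Trimmed stand-in: the real interface also declares getProtocolOutput(String).
interface Protocol {
    void configure(Configuration conf);
}

// Hypothetical implementation that only records whether it was configured.
class StubProtocol implements Protocol {
    boolean configured = false;

    @Override
    public void configure(Configuration conf) {
        configured = true;
    }
}

class SchemeProtocolFactory {
    private final Configuration config;
    private final Map<String, Supplier<Protocol>> suppliers = new HashMap<>();
    // Plain HashMap: the set of schemes is small, so entries are kept for the factory's lifetime.
    private final Map<String, Protocol> cache = new HashMap<>();

    SchemeProtocolFactory(Configuration conf) {
        config = conf;
    }

    /** Associates a URL scheme such as "http" with a way to build its Protocol. */
    synchronized void register(String scheme, Supplier<Protocol> supplier) {
        suppliers.put(scheme.toLowerCase(Locale.ROOT), supplier);
    }

    /** Returns the cached Protocol for the URL's scheme, creating and configuring it on first use. */
    synchronized Protocol getProtocol(URL url) {
        String scheme = url.getProtocol().toLowerCase(Locale.ROOT);
        Protocol cached = cache.get(scheme);
        if (cached != null) {
            return cached;
        }
        Supplier<Protocol> supplier = suppliers.get(scheme);
        if (supplier == null) {
            throw new IllegalArgumentException("No protocol registered for scheme: " + scheme);
        }
        Protocol created = supplier.get();
        created.configure(config);
        cache.put(scheme, created);
        return created;
    }

    /** Convenience overload that turns a malformed URL into an unchecked exception. */
    Protocol getProtocol(String url) {
        try {
            return getProtocol(new URL(url));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Malformed URL: " + url, e);
        }
    }
}
```

The sketch uses a regular `HashMap` for the cache. A `WeakHashMap` holds its keys weakly, so an entry can be dropped once nothing else references the key string, which may lead to the protocol being rebuilt more often than intended.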
The ProtocolResponse wrapper:
package com.digitalpebble.storm.crawler.fetcher;

import java.util.HashMap;

public class ProtocolResponse {

    final byte[] content;
    final int statusCode;
    final HashMap<String, String[]> metadata;

    public ProtocolResponse(byte[] c, int s, HashMap<String, String[]> md) {
        content = c;
        statusCode = s;
        metadata = md;
    }

    public byte[] getContent() {
        return content;
    }

    public int getStatusCode() {
        return statusCode;
    }

    public HashMap<String, String[]> getMetadata() {
        return metadata;
    }
}
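As an illustration of how a caller might consume the response object above, the sketch below checks the status code, reads one metadata value and decodes the body. `ResponseInspector` and its methods are hypothetical helpers, not part of the original project; the metadata key names and the UTF-8 decoding are assumptions, since the article does not say which keys or charsets the fetcher produces.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;

// Copy of the class shown above, without the package declaration, so the sketch compiles alone.
class ProtocolResponse {
    final byte[] content;
    final int statusCode;
    final HashMap<String, String[]> metadata;

    ProtocolResponse(byte[] c, int s, HashMap<String, String[]> md) {
        content = c;
        statusCode = s;
        metadata = md;
    }

    byte[] getContent() { return content; }
    int getStatusCode() { return statusCode; }
    HashMap<String, String[]> getMetadata() { return metadata; }
}

// Hypothetical helper methods for reading a ProtocolResponse.
class ResponseInspector {

    /** True for status codes in the 2xx range. */
    static boolean isSuccess(ProtocolResponse response) {
        int status = response.getStatusCode();
        return status >= 200 && status < 300;
    }

    /** Returns the first value stored under the given metadata key, or null if there is none. */
    static String firstValue(ProtocolResponse response, String key) {
        String[] values = response.getMetadata().get(key);
        if (values == null || values.length == 0) {
            return null;
        }
        return values[0];
    }

    /** Decodes the body as UTF-8; a real crawler would normally detect the charset first. */
    static String bodyAsUtf8(ProtocolResponse response) {
        return new String(response.getContent(), StandardCharsets.UTF_8);
    }
}
```

Each metadata key maps to a `String[]` because a single key can carry several values, so a caller has to decide whether to take the first value or all of them.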
Storm [Practice series: how to write a crawler: wrapping Protocol]
Original article: http://my.oschina.net/u/1791874/blog/305263