
cn.wanghaomiao.seimi.core.SeimiCrawler Maven / Gradle / Ivy


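A crawler framework that supports distributed deployment and is efficient both to develop with and to run. Its design combines the strengths of Spring and Scrapy.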

There is a newer version: 2.1.4
package cn.wanghaomiao.seimi.core;

import cn.wanghaomiao.seimi.struct.Response;
import org.apache.http.client.CookieStore;

/**
 * @author 汪浩淼 [[email protected]]
 *         Date: 2015/5/28.
 */
public interface SeimiCrawler {

    /**
     * Returns the User-Agent string for this crawler.
     * @return the User-Agent string
     */
    public String getUserAgent();

    /**
     * If cookies are enabled, use this method to obtain the CookieStore.
     * @return CookieStore
     */
    public CookieStore getCookieStore();
    /**
     * Sets the start URLs.
     * @return the start URLs
     */
    public String[] startUrls();

    /**
     * Initial callback, invoked with the first batch of responses
     * generated for the start URLs.
     * @param response a response produced for one of the start URLs
     */
    public void start(Response response);
}
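
The following is a minimal sketch of how a class might implement this interface, not the framework's own usage pattern. So that it compiles without the SeimiCrawler or Apache HttpClient jars, it declares local stand-ins for `Response` and `CookieStore` and repeats the interface; the `getUrl()` accessor, the `ExampleCrawler` class, its `visited` list, the User-Agent value, and the example URL are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for cn.wanghaomiao.seimi.struct.Response (illustration only).
class Response {
    private final String url;
    Response(String url) { this.url = url; }
    String getUrl() { return url; }
}

// Hypothetical stand-in for org.apache.http.client.CookieStore.
interface CookieStore {}

// Copy of the interface listed above, so that the sketch is self-contained.
interface SeimiCrawler {
    String getUserAgent();
    CookieStore getCookieStore();
    String[] startUrls();
    void start(Response response);
}

// Hypothetical crawler that records the URL of every response it is handed.
class ExampleCrawler implements SeimiCrawler {
    final List<String> visited = new ArrayList<>();

    @Override
    public String getUserAgent() {
        return "ExampleCrawler/0.1"; // assumed value
    }

    @Override
    public CookieStore getCookieStore() {
        return null; // assumption: cookies are not enabled in this sketch
    }

    @Override
    public String[] startUrls() {
        return new String[] {"http://example.com/"};
    }

    @Override
    public void start(Response response) {
        // Entry callback: called once per response generated for a start URL.
        visited.add(response.getUrl());
    }
}
```

In the real framework the dispatch from `startUrls()` to `start(Response)` is presumably performed by the crawler engine; in this sketch a caller would loop over `startUrls()` and pass a `Response` for each URL to `start` by hand.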



