Example 6 with VSCrawler

Use of com.virjar.vscrawler.core.VSCrawler in project vscrawler by virjar.

Class GrabController, method grab.

@RequestMapping("/grab")
@ResponseBody
public WebJsonResponse<?> grab(@RequestBody GrabRequest grabRequestBean) {
    try {
        VSCrawler vsCrawler = crawlerManager.get(grabRequestBean.getCrawlerName());
        if (vsCrawler == null) {
            return ReturnUtil.failed("no crawler defined :" + grabRequestBean.getCrawlerName());
        }
        Seed seed = new Seed(JSONObject.toJSONString(grabRequestBean));
        GrabResult crawlResult = vsCrawler.grabSync(seed);
        List<Object> strings = crawlResult.allObjectResult();
        if (strings.isEmpty() && seed.getRetry() > 0) {
            return ReturnUtil.failed("timeOut", ReturnUtil.status_timeout);
        } else {
            return ReturnUtil.success(strings);
        }
    } catch (Exception e) {
        return ReturnUtil.failed(e.getMessage());
    }
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) Seed(com.virjar.vscrawler.core.seed.Seed) GrabResult(com.virjar.vscrawler.core.processor.GrabResult) JSONObject(com.alibaba.fastjson.JSONObject) RequestMapping(org.springframework.web.bind.annotation.RequestMapping) ResponseBody(org.springframework.web.bind.annotation.ResponseBody)
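
The control flow of the grab endpoint (look up a crawler by name, run it synchronously, and treat an empty result after retries as a timeout) can be sketched without Spring or vscrawler on the classpath. Everything in this sketch is a hypothetical stand-in: GrabFlowSketch, Grabber, Outcome and Response are illustrative names, not types from the vscrawler project, and the retry count is carried on the outcome here only as an assumption to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GrabFlowSketch {

    /** Hypothetical result of one synchronous grab: extracted objects plus retries used. */
    public static final class Outcome {
        public final List<Object> results;
        public final int retries;

        public Outcome(List<Object> results, int retries) {
            this.results = results;
            this.retries = retries;
        }
    }

    /** Hypothetical stand-in for a crawler that can process a seed synchronously. */
    public interface Grabber {
        Outcome grab(String seed);
    }

    /** Hypothetical response envelope (the project itself uses WebJsonResponse). */
    public static final class Response {
        public final boolean ok;
        public final String message;
        public final List<Object> data;

        public Response(boolean ok, String message, List<Object> data) {
            this.ok = ok;
            this.message = message;
            this.data = data;
        }
    }

    private final Map<String, Grabber> crawlers = new ConcurrentHashMap<>();

    public void register(String name, Grabber grabber) {
        crawlers.put(name, grabber);
    }

    public Response grab(String crawlerName, String seed) {
        Grabber grabber = crawlers.get(crawlerName);
        if (grabber == null) {
            // Unknown crawler name: fail fast instead of throwing.
            return new Response(false, "no crawler defined :" + crawlerName, Collections.emptyList());
        }
        try {
            Outcome outcome = grabber.grab(seed);
            if (outcome.results.isEmpty() && outcome.retries > 0) {
                // Nothing extracted although retries happened: report a timeout.
                return new Response(false, "timeOut", Collections.emptyList());
            }
            return new Response(true, "success", outcome.results);
        } catch (RuntimeException e) {
            // Any crawler failure becomes a failed response.
            return new Response(false, String.valueOf(e.getMessage()), Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        GrabFlowSketch sketch = new GrabFlowSketch();
        sketch.register("demo", seed -> new Outcome(Arrays.asList((Object) seed), 0));
        System.out.println(sketch.grab("demo", "http://example.com").message);
        System.out.println(sketch.grab("missing", "http://example.com").message);
    }
}
```

Returning a failed response for an unknown name, rather than throwing, mirrors how the controller above keeps every outcome inside the same response envelope.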

Example 7 with VSCrawler

Use of com.virjar.vscrawler.core.VSCrawler in project vscrawler by virjar.

Class CrawlerController, method start.

@RequestMapping("/startCrawler")
@ResponseBody
public WebJsonResponse<String> start(@RequestParam("crawlerName") String crawlerName) {
    VSCrawler vsCrawler = crawlerManager.get(crawlerName);
    if (vsCrawler == null) {
        return ReturnUtil.failed("no crawler defined :" + crawlerName);
    }
    vsCrawler.start();
    return ReturnUtil.success("success");
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) RequestMapping(org.springframework.web.bind.annotation.RequestMapping) ResponseBody(org.springframework.web.bind.annotation.ResponseBody)

Example 8 with VSCrawler

Use of com.virjar.vscrawler.core.VSCrawler in project vscrawler by virjar.

Class ResourceController, method reloadResource.

@RequestMapping("/reloadResource")
@ResponseBody
public WebJsonResponse<String> reloadResource(@RequestParam("crawlerName") String appSource, @RequestParam("resourceName") String resourceName) {
    VSCrawler vsCrawler = crawlerManager.get(appSource);
    if (vsCrawler == null) {
        return ReturnUtil.failed("no crawler defined :" + appSource);
    }
    VSCrawlerContext vsCrawlerContext = vsCrawler.getVsCrawlerContext();
    vsCrawlerContext.getResourceManager().reloadResource(resourceName);
    return ReturnUtil.success("success");
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) VSCrawlerContext(com.virjar.vscrawler.core.VSCrawlerContext) RequestMapping(org.springframework.web.bind.annotation.RequestMapping) ResponseBody(org.springframework.web.bind.annotation.ResponseBody)

Example 9 with VSCrawler

Use of com.virjar.vscrawler.core.VSCrawler in project vscrawler by virjar.

Class ResourceController, method reloadAccount.

@RequestMapping("/reloadAccount")
@ResponseBody
public WebJsonResponse<String> reloadAccount(@RequestParam("crawlerName") String appSource) {
    VSCrawler vsCrawler = crawlerManager.get(appSource);
    if (vsCrawler == null) {
        return ReturnUtil.failed("no crawler defined :" + appSource);
    }
    VSCrawlerContext vsCrawlerContext = vsCrawler.getVsCrawlerContext();
    vsCrawlerContext.getResourceManager().reloadResource(vsCrawlerContext.makeUserResourceTag());
    return ReturnUtil.success("success");
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) VSCrawlerContext(com.virjar.vscrawler.core.VSCrawlerContext) RequestMapping(org.springframework.web.bind.annotation.RequestMapping) ResponseBody(org.springframework.web.bind.annotation.ResponseBody)

Example 10 with VSCrawler

Use of com.virjar.vscrawler.core.VSCrawler in project vscrawler by virjar.

Class VSCrawlerClassLoader, method loadCrawler.

/**
 * @param crawlerEntryName the crawler entry class; it should be an implementation of com.virjar.vscrawler.web.crawler.CrawlerBuilder
 * @return a crawler object constructed by the entry class
 * @see CrawlerBuilder
 */
public CrawlerBean loadCrawler(String crawlerEntryName, WebApplicationContext webApplicationContext) throws InstantiationException, IllegalAccessException {
    // check
    try {
        CrawlerBuilder crawlerBuilder = (CrawlerBuilder) loadClass(crawlerEntryName).newInstance();
        if (crawlerBuilder instanceof SpringContextAware) {
            SpringContextAware springContextAware = (SpringContextAware) crawlerBuilder;
            springContextAware.init4SpringContext(webApplicationContext);
        }
        // for spring bean auto injection
        injectDependency(crawlerBuilder, true, webApplicationContext);
        VSCrawler vsCrawler = crawlerBuilder.build();
        return new CrawlerBean(vsCrawler, true, this);
    } catch (ClassNotFoundException e) {
        // this exception will not happen
    }
    return null;
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) SpringContextAware(com.virjar.vscrawler.web.api.SpringContextAware) CrawlerBuilder(com.virjar.vscrawler.web.api.CrawlerBuilder) CrawlerBean(com.virjar.vscrawler.web.model.CrawlerBean)
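
The core of loadCrawler is a common pattern: load an entry class by name, instantiate it reflectively, and ask it to build the real object. A minimal sketch of that pattern follows, under stated assumptions: BuilderLoaderSketch, Builder and GreetingBuilder are hypothetical names, not vscrawler types, and the sketch uses getDeclaredConstructor().newInstance() in place of Class.newInstance(), which has been deprecated since Java 9. It also rejects classes that do not implement the builder interface rather than relying on a cast.

```java
public class BuilderLoaderSketch {

    /** Hypothetical stand-in for CrawlerBuilder: anything that can build a product. */
    public interface Builder<T> {
        T build();
    }

    /** Hypothetical builder used only for this demonstration. */
    public static class GreetingBuilder implements Builder<String> {
        @Override
        public String build() {
            return "built";
        }
    }

    /**
     * Loads the named entry class, checks that it implements Builder,
     * instantiates it through its no-argument constructor and returns its product.
     */
    public static <T> T loadAndBuild(ClassLoader loader, String entryName, Class<T> productType)
            throws ReflectiveOperationException {
        Class<?> raw = loader.loadClass(entryName);
        if (!Builder.class.isAssignableFrom(raw)) {
            throw new IllegalArgumentException(entryName + " does not implement Builder");
        }
        Builder<?> builder = (Builder<?>) raw.getDeclaredConstructor().newInstance();
        // cast() throws ClassCastException if the product has an unexpected type.
        return productType.cast(builder.build());
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        String product = loadAndBuild(
                BuilderLoaderSketch.class.getClassLoader(),
                GreetingBuilder.class.getName(),
                String.class);
        System.out.println(product);
    }
}
```

Letting ReflectiveOperationException propagate, instead of catching ClassNotFoundException and returning null, is a design choice in this sketch: a caller sees why loading failed rather than receiving a null it has to check for.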

Aggregations

VSCrawler (com.virjar.vscrawler.core.VSCrawler): 11
RequestMapping (org.springframework.web.bind.annotation.RequestMapping): 5
ResponseBody (org.springframework.web.bind.annotation.ResponseBody): 5
GrabResult (com.virjar.vscrawler.core.processor.GrabResult): 3
Seed (com.virjar.vscrawler.core.seed.Seed): 3
VSCrawlerContext (com.virjar.vscrawler.core.VSCrawlerContext): 2
CrawlerSession (com.virjar.vscrawler.core.net.session.CrawlerSession): 2
SeedProcessor (com.virjar.vscrawler.core.processor.SeedProcessor): 2
CrawlerBuilder (com.virjar.vscrawler.web.api.CrawlerBuilder): 2
CrawlerBean (com.virjar.vscrawler.web.model.CrawlerBean): 2
JSONObject (com.alibaba.fastjson.JSONObject): 1
Function (com.google.common.base.Function): 1
SegmentResolver (com.virjar.vscrawler.core.seed.SegmentResolver): 1
SpringContextAware (com.virjar.vscrawler.web.api.SpringContextAware): 1
File (java.io.File): 1
IOException (java.io.IOException): 1
JarFile (java.util.jar.JarFile): 1
ZipFile (java.util.zip.ZipFile): 1
DateTime (org.joda.time.DateTime): 1
MultipartFile (org.springframework.web.multipart.MultipartFile): 1