Example 1 with CrawlerBuilder

Use of com.virjar.vscrawler.web.api.CrawlerBuilder in project vscrawler by virjar.

Class VSCrawlerManager, method loadJarFile.

private CrawlerBean loadJarFile(File jarFile) throws Exception {
    ClassLoader originContextClassLoader = Thread.currentThread().getContextClassLoader();
    try {
        VSCrawlerClassLoader vsCrawlerClassLoader = new VSCrawlerClassLoader(jarFile, originContextClassLoader);
        Thread.currentThread().setContextClassLoader(vsCrawlerClassLoader);
        ClassScanner.SubClassVisitor<CrawlerBuilder> subClassVisitor = new ClassScanner.SubClassVisitor<>(true, CrawlerBuilder.class);
        // scan only this jar, so that classes visible through the parent class loader are not picked up
        ClassScanner.scanJarFile(new JarFile(jarFile), subClassVisitor);
        if (subClassVisitor.getSubClass().size() == 0) {
            return null;
        }
        if (subClassVisitor.getSubClass().size() != 1) {
            log.error("a crawler jar may define only one crawler, but found {} in {}: {}; this jar file will be ignored", subClassVisitor.getSubClass().size(), jarFile.getAbsoluteFile(), StringUtils.join(subClassVisitor.getSubClass(), ","));
            return null;
        }
        return vsCrawlerClassLoader.loadCrawler(subClassVisitor.getSubClass().get(0).getName(), webApplicationContext);
    } finally {
        Thread.currentThread().setContextClassLoader(originContextClassLoader);
    }
}
Also used : ClassScanner(com.virjar.vscrawler.core.util.ClassScanner) VSCrawlerClassLoader(com.virjar.vscrawler.web.crawlerloader.VSCrawlerClassLoader) CrawlerBuilder(com.virjar.vscrawler.web.api.CrawlerBuilder) JarFile(java.util.jar.JarFile)
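
The method above swaps the thread's context class loader for the duration of the scan and restores it in a finally block. A minimal, self-contained sketch of that pattern follows. The class ContextClassLoaderSwap and the helper callWith are hypothetical names introduced for illustration and are not part of vscrawler; an empty URLClassLoader stands in for VSCrawlerClassLoader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of vscrawler: illustrates the save/set/restore pattern only.
final class ContextClassLoaderSwap {

    /** Runs the task with the given loader as context class loader, then restores the previous one. */
    static <T> T callWith(ClassLoader loader, Callable<T> task) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return task.call();
        } finally {
            // restore even when the task throws, as loadJarFile does
            current.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader before = Thread.currentThread().getContextClassLoader();
        // an empty URLClassLoader stands in for a jar-specific loader
        try (URLClassLoader child = new URLClassLoader(new URL[0], before)) {
            ClassLoader seen = callWith(child, () -> Thread.currentThread().getContextClassLoader());
            System.out.println(seen == child);                                            // true
            System.out.println(Thread.currentThread().getContextClassLoader() == before); // true
        }
    }
}
```

Wrapping the swap in a helper keeps the restore step in one place, so an early return inside the task cannot skip it.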

Example 2 with CrawlerBuilder

Use of com.virjar.vscrawler.web.api.CrawlerBuilder in project vscrawler by virjar.

Class VSCrawlerClassLoader, method loadCrawler.

/**
 * @param crawlerEntryName the crawler entry class; it should be an implementation of com.virjar.vscrawler.web.api.CrawlerBuilder
 * @return a crawler object built from the entry class
 * @see CrawlerBuilder
 */
public CrawlerBean loadCrawler(String crawlerEntryName, WebApplicationContext webApplicationContext) throws InstantiationException, IllegalAccessException {
    // instantiate the entry class, which is expected to implement CrawlerBuilder
    try {
        CrawlerBuilder crawlerBuilder = (CrawlerBuilder) loadClass(crawlerEntryName).newInstance();
        if (crawlerBuilder instanceof SpringContextAware) {
            SpringContextAware springContextAware = (SpringContextAware) crawlerBuilder;
            springContextAware.init4SpringContext(webApplicationContext);
        }
        // for spring bean auto injection
        injectDependency(crawlerBuilder, true, webApplicationContext);
        VSCrawler vsCrawler = crawlerBuilder.build();
        return new CrawlerBean(vsCrawler, true, this);
    } catch (ClassNotFoundException e) {
        // not expected to happen: callers pass the name of a class already found in this jar
    }
    return null;
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) SpringContextAware(com.virjar.vscrawler.web.api.SpringContextAware) CrawlerBuilder(com.virjar.vscrawler.web.api.CrawlerBuilder) CrawlerBean(com.virjar.vscrawler.web.model.CrawlerBean)
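
loadCrawler instantiates the entry class by name, runs an optional context callback, and then calls build(). The sketch below shows the same sequence on stand-in types. Builder, ContextAware, DemoBuilder and load are hypothetical and only mirror the roles of CrawlerBuilder and SpringContextAware; a plain String stands in for the Spring context. As an assumption about preferred style rather than a statement about the project, it uses getDeclaredConstructor().newInstance(), because Class.newInstance() has been deprecated since Java 9.

```java
// Hypothetical stand-ins, not the vscrawler API.
final class ReflectiveBuilderLoading {

    interface Builder {
        String build();
    }

    interface ContextAware {
        void init(String context);
    }

    static final class DemoBuilder implements Builder, ContextAware {
        private String context = "none";

        public DemoBuilder() {
        }

        @Override
        public void init(String context) {
            this.context = context;
        }

        @Override
        public String build() {
            return "crawler@" + context;
        }
    }

    /** Loads className through loader, applies the optional callback, and returns the built value. */
    static String load(ClassLoader loader, String className, String context)
            throws ReflectiveOperationException {
        // asSubclass fails fast with ClassCastException if the class is not a Builder
        Class<? extends Builder> type = loader.loadClass(className).asSubclass(Builder.class);
        Builder builder = type.getDeclaredConstructor().newInstance();
        if (builder instanceof ContextAware) {
            ((ContextAware) builder).init(context);
        }
        return builder.build();
    }

    public static void main(String[] args) throws Exception {
        String built = load(DemoBuilder.class.getClassLoader(), DemoBuilder.class.getName(), "ctx");
        System.out.println(built); // crawler@ctx
    }
}
```

Declaring ReflectiveOperationException lets the caller see a missing class or constructor instead of receiving null.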

Example 3 with CrawlerBuilder

Use of com.virjar.vscrawler.web.api.CrawlerBuilder in project vscrawler by virjar.

Class VSCrawlerManager, method init.

private synchronized void init() {
    if (hasInit) {
        return;
    }
    // cannot be auto-injected by the Spring framework: if there are no implementations, an exception would be thrown
    Map<String, CrawlerBuilder> beansOfType = webApplicationContext.getBeansOfType(CrawlerBuilder.class);
    crawlerBuilderList.addAll(beansOfType.values());
    // load system crawler
    for (CrawlerBuilder crawlerBuilder : crawlerBuilderList) {
        VSCrawler vsCrawler = crawlerBuilder.build();
        allCrawler.put(vsCrawler.getVsCrawlerContext().getCrawlerName(), new CrawlerBean(vsCrawler));
    }
    // load jar file
    // find jar file root dir
    File jarDir = new File(calcHotJarDir());
    moveEmbedCrawler(jarDir);
    loadHotJar(jarDir);
    hasInit = true;
}
Also used : VSCrawler(com.virjar.vscrawler.core.VSCrawler) CrawlerBuilder(com.virjar.vscrawler.web.api.CrawlerBuilder) CrawlerBean(com.virjar.vscrawler.web.model.CrawlerBean) JarFile(java.util.jar.JarFile) ZipFile(java.util.zip.ZipFile) MultipartFile(org.springframework.web.multipart.MultipartFile)
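
init guards its body with a flag inside a synchronized method, so the builders run at most once, and it registers each built crawler under its name. A reduced sketch of that guard follows, under the assumption that a Supplier<String> returning a crawler name can stand in for CrawlerBuilder. OnceOnlyRegistry and all of its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration of a flag-guarded, synchronized, run-once initializer.
final class OnceOnlyRegistry {
    private final Map<String, String> crawlers = new LinkedHashMap<>();
    private final List<Supplier<String>> builders;
    private boolean initialized;
    private int initRuns;

    OnceOnlyRegistry(List<Supplier<String>> builders) {
        this.builders = new ArrayList<>(builders);
    }

    synchronized void init() {
        if (initialized) {
            return;
        }
        for (Supplier<String> builder : builders) {
            String name = builder.get();
            crawlers.put(name, "bean:" + name);
        }
        initRuns++;
        // set last, as in the original: a failed run leaves the flag unset
        initialized = true;
    }

    synchronized Map<String, String> crawlers() {
        init(); // intrinsic locks are reentrant, so this nested call cannot deadlock
        return new LinkedHashMap<>(crawlers);
    }

    synchronized int initRuns() {
        return initRuns;
    }

    public static void main(String[] args) {
        List<Supplier<String>> builders = new ArrayList<>();
        builders.add(() -> "alpha");
        builders.add(() -> "beta");
        OnceOnlyRegistry registry = new OnceOnlyRegistry(builders);
        registry.init();
        registry.init();
        System.out.println(registry.crawlers().keySet()); // [alpha, beta]
        System.out.println(registry.initRuns());          // 1
    }
}
```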

Aggregations

CrawlerBuilder (com.virjar.vscrawler.web.api.CrawlerBuilder) 3
VSCrawler (com.virjar.vscrawler.core.VSCrawler) 2
CrawlerBean (com.virjar.vscrawler.web.model.CrawlerBean) 2
JarFile (java.util.jar.JarFile) 2
ClassScanner (com.virjar.vscrawler.core.util.ClassScanner) 1
SpringContextAware (com.virjar.vscrawler.web.api.SpringContextAware) 1
VSCrawlerClassLoader (com.virjar.vscrawler.web.crawlerloader.VSCrawlerClassLoader) 1
ZipFile (java.util.zip.ZipFile) 1
MultipartFile (org.springframework.web.multipart.MultipartFile) 1