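I am trying to run Nutch from Eclipse and followed some tutorials to set it up. I am currently stuck on a NullPointerException, which I believe is caused by regex-urlfilter.txt and regex-normalize.xml not being found.
Here is the error trace from the log: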
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.conf.Configuration - regex-normalize.xml not found
4473 [LocalJobRunner Map Task Executor #0] WARN org.apache.nutch.net.urlnormalizer.regex.RegexURLNormalizer - Can't load the default rules!
4477 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.nutch.util.ObjectCache - No object cache found for conf=Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/tmp/hadoop-338737067/mapred/local/localRunner/338737067/job_local1524701719_0001/job_local1524701719_0001.xml, instantiating a new object cache
4486 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.conf.Configuration - regex-urlfilter.txt not found
4486 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
4516 [LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.util.concurrent.ExecutorHelper - afterExecute in thread: LocalJobRunner Map Task Executor #0, runnable type: java.util.concurrent.FutureTask
4516 [Thread-3] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
4521 [Thread-3] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1524701719_0001
java.lang.Exception: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:491)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:551)
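Both files are present in the \workspace\apache-nutch-1.16\conf folder, so I am not sure what I am doing wrong. I have double-checked that the HADOOP_HOME and HADOOP_BIN environment variables are set correctly and point to the right directories. What I don't know is which directory is searched for regex-urlfilter.txt and regex-normalize.xml. Any help in resolving this would be greatly appreciated.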
I am using Hadoop 3.0.0 and apache-nutch-1.16.
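In case it helps narrow this down: as far as I understand, Hadoop's Configuration resolves names such as regex-urlfilter.txt as classpath resources, not as files in a fixed directory. The snippet below is only a minimal sketch to check that assumption from the same Eclipse project. The class name ClasspathResourceCheck and the method locate are my own, hypothetical names, and the code uses only the standard class loader API, not Nutch or Hadoop.

```java
import java.net.URL;

// Hypothetical diagnostic class (not part of Nutch or Hadoop).
// Assumption: the two config files are looked up as classpath resources.
public class ClasspathResourceCheck {

    // Returns the URL under which the named resource is visible on the
    // classpath, or null if no class loader in the chain can find it.
    static URL locate(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader that loaded this class.
            cl = ClasspathResourceCheck.class.getClassLoader();
        }
        return cl.getResource(name);
    }

    public static void main(String[] args) {
        String[] names = {"regex-urlfilter.txt", "regex-normalize.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> "
                    + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

If this prints "NOT on classpath" when launched with the same Eclipse run configuration as the crawl, then the conf folder is probably not on the runtime classpath, which would match the "not found" messages in the log above.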