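Introduction
This article uses an example that prints files from a Hadoop filesystem to standard output through a URLStreamHandler, and shows how to locate and fix the Connection Refused and FileNotFoundException problems that come up along the way. The example uses Hadoop 2.7.3.
URLCat Example
The following example uses a URLStreamHandler to write a file from a Hadoop filesystem to standard output.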
// cc URLCat Displays files from a Hadoop filesystem on standard output using a URLStreamHandler
import java.io.InputStream;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;
// vv URLCat
public class URLCat {

  static {
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
  }

  public static void main(String[] args) throws Exception {
    InputStream in = null;
    try {
      in = new URL(args[0]).openStream();
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}
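To make the mechanism behind URLCat easier to see, here is a minimal sketch that registers a handler factory for a made-up scheme using only the JDK. The scheme name demo, the class DemoSchemeHandler, and its EchoHandler are hypothetical and have nothing to do with Hadoop's FsUrlStreamHandlerFactory; the handler simply returns the URL path as the stream content. The JVM accepts only one factory (a second call to URL.setURLStreamHandlerFactory throws an Error), which is presumably why URLCat registers its factory once in a static block.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class DemoSchemeHandler {
    private static boolean installed = false;

    // Hypothetical handler: serves the URL path itself as the "file" content.
    static class EchoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    // Nothing to connect to; the content is generated in memory.
                }

                @Override
                public InputStream getInputStream() {
                    byte[] body = getURL().getPath().getBytes(StandardCharsets.UTF_8);
                    return new ByteArrayInputStream(body);
                }
            };
        }
    }

    // Registers the factory at most once, because the JVM rejects a second one.
    static synchronized void install() {
        if (installed) {
            return;
        }
        // Returning null for other schemes keeps the JDK's built-in handlers.
        URL.setURLStreamHandlerFactory(
            protocol -> "demo".equals(protocol) ? new EchoHandler() : null);
        installed = true;
    }

    // Opens the URL through the registered handler and reads it fully (Java 9+).
    static String read(String spec) throws IOException {
        try (InputStream in = new URL(spec).openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        install();
        System.out.println(read("demo://localhost/sample.txt"));
    }
}
```

URLCat follows the same pattern: once the factory is installed, new URL(...).openStream() is routed to whatever handler the factory returns for the scheme.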
Use the jar command to package URLCat into hadoop-examples.jar.
Running the Command
Use hadoop fs -put to copy a local file into HDFS, then confirm that the copy succeeded:
hadoop@bob-virtual-machine:~$ hadoop fs -ls
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2017-07-07 11:16 input
-rw-r--r-- 1 hadoop supergroup 530 2017-08-28 12:35 sample.txt
drwxr-xr-x - hadoop supergroup 0 2017-08-29 04:17 test
Before running the class, add the JAR file to HADOOP_CLASSPATH. The class can then be launched with the hadoop command.
hadoop@bob-virtual-machine:~$ export HADOOP_CLASSPATH=/home/hadoop/github/hadoop-book/hadoop-examples.jar
hadoop@bob-virtual-machine:~$ hadoop URLCat hdfs://localhost/sample.txt
Exception in thread "main" java.net.ConnectException: Call From bob-virtual-machine/127.0.1.1 to localhost:8020 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
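The exception log shows that, because the URL named no port, Hadoop connected to its default port, 8020: Call From bob-virtual-machine/127.0.1.1 to localhost:8020 failed on connection exception. The connection was refused, which means nothing is listening on port 8020. The first step is to check whether the HDFS services started correctly, which can be done with the jps command:
hadoop@bob-virtual-machine:~$ jps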
23241 SecondaryNameNode
22894 NameNode
23022 DataNode
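jps only confirms that the processes are alive; it does not say which port the NameNode is listening on. As a rough complement, the following sketch probes a TCP port from Java. The class name PortProbe, the method isOpen, and the 500 ms timeout are illustrative assumptions and not part of Hadoop.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused" as well as connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // 8020 comes from the error message; 9000 is only an assumed candidate
        // that is often used in single-node setups.
        for (int port : new int[] {8020, 9000}) {
            System.out.println("localhost:" + port + " open? "
                + isOpen("localhost", port, 500));
        }
    }
}
```

A closed port here corresponds to the Connection refused error reported by the Hadoop client.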
If the corresponding Java processes are running, the NameNode is configured to use a port other than the default 8020. Check the etc/hadoop/core-site.xml file:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
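The difference between the URL typed on the command line and the fs.defaultFS value can be made explicit with java.net.URI. This is only a sketch; the class name UriPortCheck and the method explicitPort are hypothetical. A URI without a port reports -1, and in that case the HDFS client falls back to its default, which is the 8020 seen in the error message.

```java
import java.net.URI;

public class UriPortCheck {
    // Returns the port written in the URI, or -1 when the URI names no port.
    static int explicitPort(String uri) {
        return URI.create(uri).getPort();
    }

    public static void main(String[] args) {
        // URL as originally passed to URLCat: no port given.
        System.out.println(explicitPort("hdfs://localhost/sample.txt"));
        // URL built from the fs.defaultFS value in core-site.xml.
        System.out.println(explicitPort("hdfs://localhost:9000/sample.txt"));
    }
}
```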
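The configured port is 9000. The netstat command can also show which ports are currently in use, and it confirms that nothing is listening on port 8020. With the port corrected, rerunning the command produces a new exception, FileNotFoundException: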
hadoop@bob-virtual-machine:$ hadoop URLCat hdfs://localhost:9000/sample.txt
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /sample.txt
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
This kind of error usually points to a wrong file path. List the files to check where they actually are:
hadoop@bob-virtual-machine:~$ hadoop fs -ls hdfs://localhost:9000
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2017-07-07 11:16 hdfs://localhost:9000/user/hadoop/input
-rw-r--r-- 1 hadoop supergroup 530 2017-08-28 12:35 hdfs://localhost:9000/user/hadoop/sample.txt
drwxr-xr-x - hadoop supergroup 0 2017-08-29 04:17 hdfs://localhost:9000/user/hadoop/test
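The listing shows that sample.txt is stored under /user/hadoop, the hadoop user's home directory in HDFS, and not under the root directory. The command therefore needs to be changed to:
hadoop URLCat hdfs://localhost:9000/user/hadoop/sample.txt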
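The path mix-up can be illustrated with plain java.net.URI resolution. The sketch below imitates the convention visible in the listing, where relative names live under /user/<username>, while a leading slash points at the filesystem root. It is not Hadoop's own path-resolution code, and the class name HomeDirResolve and its resolve method are hypothetical.

```java
import java.net.URI;

public class HomeDirResolve {
    // Resolves a path against /user/<user>/ on the given NameNode URI.
    // Relative names stay under the home directory; absolute paths start at the root.
    static String resolve(String nameNode, String user, String path) {
        URI home = URI.create(nameNode + "/user/" + user + "/");
        return home.resolve(path).toString();
    }

    public static void main(String[] args) {
        String nameNode = "hdfs://localhost:9000";
        // Relative name, as shown by "hadoop fs -ls" without a path.
        System.out.println(resolve(nameNode, "hadoop", "sample.txt"));
        // Absolute path, as used in the failing URLCat call.
        System.out.println(resolve(nameNode, "hadoop", "/sample.txt"));
    }
}
```

The first result is the URL that works with URLCat; the second is the one that raised FileNotFoundException.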
References
1. Hadoop Wiki: https://wiki.apache.org/hadoop/ConnectionRefused