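Android Architecture: Network Optimization

An overview of conventional network framework design and commonly used network optimization techniques.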

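  1. The OkHttp network framework
  • Simple, easy-to-use API
  • Interceptor mechanism, network retries and redirects
  • Connection pool reuse
  2. Network acceleration
  • HttpDNS and direct IP connections
  • Faster connections: short-connection reuse, HTTP/2 multiplexing, long-lived connections
  3. Data compression and serialization
  • JSON vs. ProtoBuf
  • Compression algorithms
  • Serialization
  4. Long-lived connections and the Mars architecture
  • Smart heartbeat mechanism
  • Automatic reconnection
  • Cross-process implementation on Android
  • Smart wake-up
  5. Handling complex network conditions
  • Weak networks
  • Network timeouts and flapping
  • 404 errors and DNS hijacking
  6. Keeping network data secure
  • TLS protocol: handshake and certificates
  • Data signing and verification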

https://github.com/dhhAndroid/RxWebSocket

Network Errors

ECONNABORTED

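This error is described as “software caused connection abort”. It occurs when the client and server have completed the TCP three-way handshake, but the client TCP then sends an RST (reset) segment. From the server process's point of view, the RST arrives while the connection is queued by TCP, waiting for the server process to call accept. POSIX specifies that errno must be ECONNABORTED in this case. Berkeley-derived implementations handle the aborted connection entirely within the kernel, so the server process never learns that it happened. A server process can generally ignore this error and simply call accept again. In Java the corresponding exception message is:

SocketException: Software caused connection abort: recv failed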

/* Linux system */  

include/asm-alpha/errno.h:#define ECONNABORTED 53 /* Software caused connection abort */
include/asm-generic/errno.h:#define ECONNABORTED 103 /* Software caused connection abort */
include/asm-mips/errno.h:#define ECONNABORTED 130 /* Software caused connection abort */

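This exception can have several root causes. When the server or the client closes the connection unilaterally, the other side still believes the TCP connection is established and tries to read the peer's response, which raises “Software caused connection abort: recv failed”. A firewall may also be the cause.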
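The advice to ignore an aborted connection and call accept again can be sketched in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `ResilientAccept`, the method `acceptWithRetry`, and the retry limit are hypothetical, and whether an aborted pending connection is ever reported to the application depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper: the names and the retry limit are illustrative only.
final class ResilientAccept {

    // Calls accept() again when a single accept fails, instead of tearing
    // down the listening socket. Gives up after maxFailures consecutive
    // errors, or immediately when the server socket itself is closed.
    static Socket acceptWithRetry(ServerSocket server, int maxFailures) throws IOException {
        IOException last = null;
        for (int failures = 0; failures < maxFailures; failures++) {
            try {
                return server.accept();
            } catch (IOException e) {
                if (server.isClosed()) {
                    throw e; // the listener is gone, retrying cannot help
                }
                last = e; // e.g. the pending connection was aborted by the client
            }
        }
        throw last != null ? last : new IOException("maxFailures must be positive");
    }

    // Loopback demo: returns true when a client connection is accepted.
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = acceptWithRetry(server, 3)) {
            return accepted.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + demo());
    }
}
```

The retry limit keeps a persistent failure (for example, running out of file descriptors) from turning into a busy loop.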

ECONNRESET

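This error is described as “connection reset by peer”. It generally occurs when the server process terminates before the client process. When the server process terminates, it sends a FIN segment to the client TCP; the client TCP replies with an ACK, and the server TCP moves to the FIN_WAIT2 state. If the client process does not handle the FIN at this point (for example, it is blocked in another call and has not closed the socket), the client TCP stays in the CLOSE_WAIT state. When the client process then sends data to the server TCP in FIN_WAIT2, the server TCP immediately responds with an RST. This situation can also trigger a different application-level error: after sending its data, the client process usually waits to receive data from the network, typically through a read or readline call. Depending on timing, if that call executes before the RST segment is received, the client process gets an unexpected EOF instead. In that case it typically reports “server terminated prematurely”.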
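The two outcomes described above, an unexpected EOF versus a reset, can be told apart on the client side. The following is a minimal sketch under the assumption that a Java client sees the FIN as `read()` returning -1 and the RST as a `SocketException`; the class `PeerCloseDetector` and its enum are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper that classifies the result of a single read.
final class PeerCloseDetector {

    enum ReadResult { DATA, PEER_CLOSED, RESET }

    // read() == -1     -> the peer's FIN arrived (orderly close, "unexpected EOF")
    // SocketException  -> typically "Connection reset" after an RST
    static ReadResult readOnce(Socket socket, byte[] buffer) throws IOException {
        try {
            int n = socket.getInputStream().read(buffer);
            return n == -1 ? ReadResult.PEER_CLOSED : ReadResult.DATA;
        } catch (SocketException e) {
            return ReadResult.RESET;
        }
    }

    // Loopback demo: the server side closes first, so the client reads EOF.
    static ReadResult demo() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            server.accept().close(); // server terminates first: a FIN goes to the client
            return readOnce(client, new byte[64]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On `PEER_CLOSED` the client should close its own socket rather than keep writing, which avoids provoking the RST in the first place.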

EPIPE

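This error is described as “broken pipe”. It generally occurs when the client process ignores (or does not handle in time) a socket error and keeps writing more data to the server TCP. The kernel then sends the client process a SIGPIPE signal, whose default action terminates the process (without producing a core dump). If the process catches or ignores the signal, the write fails with EPIPE instead. Taken together with the ECONNRESET case above: writing to a server TCP in FIN_WAIT2 (which has already received the ACK for its FIN) is not a problem, but writing to a socket that has already received an RST is an error.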
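In Java the process is not killed by SIGPIPE; a failed write is reported as an `IOException` (often with a message such as “Broken pipe”). A minimal sketch of a write helper that stops using a connection after the first failure might look like this. `SafeWriter` and `writeOrClose` are hypothetical names, and the demo simulates the failure with a locally closed socket rather than a real RST, because the timing of a real broken pipe is not deterministic.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper: stop writing as soon as the connection reports an error.
final class SafeWriter {

    // Attempts one write. On any I/O failure (for example "Broken pipe" or
    // "Connection reset") the socket is closed and false is returned, so the
    // caller does not keep writing to a dead connection.
    static boolean writeOrClose(Socket socket, byte[] payload) {
        try {
            OutputStream out = socket.getOutputStream();
            out.write(payload);
            out.flush();
            return true;
        } catch (IOException e) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing more can be done with this socket
            }
            return false;
        }
    }

    // Loopback demo: index 0 is a write on a healthy connection,
    // index 1 is a write after the socket has been closed locally.
    static boolean[] demo() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            boolean healthy = writeOrClose(client, new byte[] {1, 2, 3});
            client.close();
            boolean afterClose = writeOrClose(client, new byte[] {4});
            return new boolean[] {healthy, afterClose};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("healthy=" + result[0] + " afterClose=" + result[1]);
    }
}
```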

ETIMEDOUT

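This error is described as “connection timed out”. It generally occurs when the server host has crashed. The client TCP keeps retransmitting the data segment for some time (implementation dependent), trying to obtain an ACK from the server TCP. When it finally gives up (assuming the server has not rebooted in the meantime), the kernel returns ETIMEDOUT to the client process. If an intermediate router determines that the server host is unreachable, it usually responds with an ICMP “destination unreachable” message, and the client process gets EHOSTUNREACH or ENETUNREACH instead. Once the server host reboots, its TCP state is lost and none of the earlier connection information exists, so it answers requests from the client with an RST. If the client process needs to detect a crashed server host even when it is not actively sending data, it needs another technique, such as setting the SO_KEEPALIVE socket option or implementing some form of heartbeat.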
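As a rough illustration of the options mentioned above, the sketch below opens a socket with a bounded connect time, a bounded read time, and SO_KEEPALIVE enabled. The class `TimeoutConfig`, the method `open`, and the timeout values are assumptions made for the example, not recommended settings.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper: the timeout values used in the demo are illustrative only.
final class TimeoutConfig {

    static Socket open(InetSocketAddress address, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(address, connectTimeoutMs); // bound the handshake
            socket.setSoTimeout(readTimeoutMs);        // bound each blocking read
            socket.setKeepAlive(true);                 // SO_KEEPALIVE: probe an idle peer
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    // Loopback demo: the server never sends data, so the read times out.
    static boolean demoReadTimesOut() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = open(
                     new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                     1000, 200)) {
            client.getInputStream().read();
            return false; // unexpected: something was read or EOF was seen
        } catch (SocketTimeoutException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + demoReadTimesOut());
    }
}
```

Keep-alive probes are typically sent only after a long idle period (two hours by default on Linux), so an application-level heartbeat is usually still needed when a dead peer has to be detected quickly.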

ENOPROTOOPT

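This error is not related to a socket connection. errno may be set to this value when getsockopt is called to read a socket's current option state and the requested option is not supported by the system. The getsockopt/setsockopt(2) man page says: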

getsockopt, setsockopt -- get and set options on sockets.

#include <sys/socket.h>

int getsockopt(int socket, int level, int option_name,
void *restrict option_value, socklen_t *restrict option_len);

int setsockopt(int socket, int level, int option_name,
const void *option_value, socklen_t option_len);

Getsockopt() and setsockopt() manipulate the options associated with a socket. Options may exist at multiple protocol levels; they are always present at the uppermost "socket" level.

In addition, getsockopt and setsockopt can fail with the following errors:

The getsockopt/setsockopt(2) man page says:

ERRORS

The getsockopt() and setsockopt() system calls will succeed unless:

[EBADF] The argument socket is not a valid file descriptor.
[EFAULT] The address pointed to by option_value is not in a valid part of the process address space. For getsockopt(), this error may also be returned if option_len is not in a valid part of the process address space.
[EINVAL] The option is invalid at the level indicated.
[ENOBUFS] Insufficient memory buffers are available.
[ENOPROTOOPT] The option is unknown at the level indicated.
[ENOTSOCK] The argument socket is not a socket (e.g., a plain file).

The setsockopt() system call will succeed unless:

[EDOM] The argument option_value is out of bounds.
[EISCONN] socket is already connected and a specified option cannot be set while this is the case.
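Java code does not see ENOPROTOOPT directly. A loosely comparable situation is asking a socket for an option it does not support, which `Socket.setOption` reports with an `UnsupportedOperationException`. The sketch below checks `supportedOptions()` first. It assumes Java 9 or later (availability on Android depends on the API level, which is worth checking), and `OptionProbe` and `setIfSupported` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketOption;
import java.net.StandardSocketOptions;

// Hypothetical helper: probe for an option before trying to set it.
final class OptionProbe {

    // Sets the option only when the socket reports it as supported.
    // Returns false (and changes nothing) for an unsupported option.
    static <T> boolean setIfSupported(Socket socket, SocketOption<T> option, T value)
            throws IOException {
        if (!socket.supportedOptions().contains(option)) {
            return false;
        }
        socket.setOption(option, value);
        return true;
    }

    // Demo on an unconnected socket using the standard SO_KEEPALIVE option.
    static boolean demo() {
        try (Socket socket = new Socket()) {
            return setIfSupported(socket, StandardSocketOptions.SO_KEEPALIVE, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE set: " + demo());
    }
}
```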

ECONNREFUSED

A “connect failed: ECONNREFUSED (Connection refused)” most likely means that there is nothing listening on that port AND that IP address. Possible explanations include:

  • the service has crashed or hasn’t been started,
  • your client is trying to connect using the wrong IP address or port, or
  • server access is being blocked by a firewall that is “refusing” on the server/service’s behalf. This is pretty unlikely given that normal practice (these days) is for firewalls to “blackhole” all unwanted connection attempts.
  • The server couldn’t send a response: Ensure that the backend is working properly at IP and port mentioned.
  • SSL connections are being blocked: Fix this by importing SSL certificates
  • Cookies not being sent
  • Request timeout: Change request timeout
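To tell “nothing is listening” apart from “no answer at all”, a small probe can be useful during debugging. This is a minimal sketch: `RefusedProbe` and its `Status` enum are hypothetical, and it assumes that a refused connection surfaces in Java as a `ConnectException`, which is the usual case but not the only reason that exception is thrown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical diagnostic helper for connection failures.
final class RefusedProbe {

    enum Status { OPEN, REFUSED, TIMED_OUT }

    static Status probe(InetSocketAddress address, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMs);
            return Status.OPEN;
        } catch (ConnectException e) {
            return Status.REFUSED;   // usually "Connection refused": nothing is listening
        } catch (SocketTimeoutException e) {
            return Status.TIMED_OUT; // no answer in time: host down or packets dropped
        }
    }

    // Loopback demo: probe a port while a listener exists, then after it is closed.
    static Status[] demo() {
        try {
            ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
            InetSocketAddress address =
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort());
            Status whileListening = probe(address, 1000);
            server.close();
            Status afterClose = probe(address, 1000);
            return new Status[] {whileListening, afterClose};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Status[] result = demo();
        System.out.println(result[0] + " then " + result[1]);
    }
}
```

A `TIMED_OUT` result is what a firewall that silently drops packets tends to produce, while `REFUSED` points at a missing listener or a wrong address or port.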

The java.net.SocketException is thrown when there is an error creating or accessing a socket (such as TCP). It is often caused by the server terminating the connection without closing it properly, before the client has received the full response. In most cases the cause is either a timeout (for example, the response takes too long or the server is overloaded with requests), or the client sent a FIN but did not receive the ACK (the acknowledgment of the connection termination). For timeout issues, consider increasing the timeout value.

A SocketException usually carries a detail message that describes the issue.

Example of detailed messages:

Software caused connection abort: recv failed.

This error indicates that a message was being sent when the connection was aborted by your server. If this happens while connecting to a database, it may be caused by an incompatible Connector/J JDBC driver.

Possible solution: Make sure you have the proper libraries/drivers in your CLASSPATH.

Software caused connection abort: connect.

This can happen when there is a problem connecting to the remote host, for example when a virus checker rejects outgoing mail requests.

Possible solution: Check whether the virus-scan service is blocking the port used for outgoing connections.

Software caused connection abort: socket write error.

Possible solution: Make sure you’re writing the correct number of bytes to the stream, and double-check what you’re sending.

Connection reset by peer: socket write error / Connection aborted by peer: socket write error

The application did not check whether the keep-alive connection had timed out on the server side.

Possible solution: Ensure that the HttpClient is non-null before reading from the connection.

Connection reset by peer.

The connection has been terminated by the peer (server).

Connection reset.

The connection was either terminated by the client or closed by the server end of the connection.
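For logging or diagnostics, the detail messages listed above can be mapped to a short hint. The sketch below is only an illustration: `SocketErrorHints` and `hintFor` are hypothetical, the hint texts paraphrase the explanations in this section, and message strings vary between platforms and JVM versions, so matching on them should never drive program logic.

```java
import java.net.SocketException;
import java.util.Locale;

// Hypothetical diagnostic helper: maps a detail message to a human-readable hint.
final class SocketErrorHints {

    static String hintFor(SocketException e) {
        String message = e.getMessage() == null
                ? ""
                : e.getMessage().toLowerCase(Locale.ROOT);
        if (message.contains("connection abort")) {
            return "aborted: check firewall, virus scanner, and driver/library versions";
        }
        if (message.contains("reset")) {
            return "reset: the peer closed the connection, possibly an expired keep-alive";
        }
        if (message.contains("broken pipe")) {
            return "broken pipe: data was written after the peer had reset the connection";
        }
        return "unclassified socket error";
    }

    public static void main(String[] args) {
        System.out.println(hintFor(new SocketException("Connection reset by peer")));
    }
}
```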

