
Attempting to Resolve the Thread Deadlock Caused by Dns.GetHostAddressesAsync() in the .NET Core Framework

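This pitfall left a lasting impression. First, a look at how corefx implements the async methods of System.Net.Dns. The snippet shown here is the GetHostEntryAsync(IPAddress) overload; GetHostAddressesAsync() (quoted in full under attempt 3 below) wraps its Begin/End pair with Task.Factory.FromAsync in exactly the same way.

    public static Task<IPHostEntry> GetHostEntryAsync(IPAddress address)
    {
        NameResolutionPal.EnsureSocketsAreInitialized();
        return Task<IPHostEntry>.Factory.FromAsync(
            (arg, requestCallback, stateObject) => BeginGetHostEntry(arg, requestCallback, stateObject),
            asyncResult => EndGetHostEntry(asyncResult),
            address,
            null);
    }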

Next, what hitting this pitfall looks like on Linux and on Windows.

Linux:

Microsoft.AspNetCore.Server.Kestrel.Internal.Networking.UvException: Error -24 EMFILE too many open files

Windows: the process ended up with about 13,000 threads.

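The code that triggered the problem:

    Task<IPAddress[]> task = System.Net.Dns.GetHostAddressesAsync(host);
    // Blocks the calling thread for up to 5 seconds.
    task.Wait(5000);
    // Blocks again, without a timeout, if the task has still not completed.
    var addresses = task.Result;

The code above runs inside a constructor, so it has to block synchronously; it cannot be awaited there.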

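Conditions for hitting the problem: it appears only once a certain number of requests arrive concurrently, and never with just a handful of requests. So when we deploy by taking the server out of the load balancer, ending the process, updating the application, visiting the site locally (which completes the host resolution), and only then putting the server back behind the load balancer, the problem does not occur. If we instead kill the ASP.NET Core process without removing the server from the load balancer, the newly started process runs straight into it.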
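The deployment workaround above amounts to resolving the host before real traffic arrives. A minimal sketch of doing the same thing in code, assuming the application has an async startup path and can live with addresses that are resolved once at startup. `HostAddressCache`, `WarmUpAsync`, and `TryGet` are hypothetical names introduced for illustration; they are not part of the original application or of corefx.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

// Hypothetical cache, introduced only for this sketch.
public static class HostAddressCache
{
    private static readonly ConcurrentDictionary<string, IPAddress[]> Cache =
        new ConcurrentDictionary<string, IPAddress[]>(StringComparer.OrdinalIgnoreCase);

    // Call once from an async startup path, before the server accepts requests.
    public static async Task WarmUpAsync(IEnumerable<string> hosts)
    {
        foreach (string host in hosts)
        {
            // Awaited, so no thread is blocked while the lookup is in flight.
            Cache[host] = await Dns.GetHostAddressesAsync(host).ConfigureAwait(false);
        }
    }

    // Safe to call from a constructor: it only reads the cache and never resolves.
    public static bool TryGet(string host, out IPAddress[] addresses)
    {
        return Cache.TryGetValue(host, out addresses);
    }
}
```

A constructor could then call `HostAddressCache.TryGet(host, out var addresses)` and fall back to its existing path on a miss. The trade-off is that cached addresses go stale if the DNS record changes while the process is running.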

Now for the attempted fixes.

1) Following "Synchronously waiting for an async operation, and why does Wait() freeze the program here", the code above was changed to:

    var task = Task.Run(async () =>
    {
        return await System.Net.Dns.GetHostAddressesAsync(host);
    });
    task.Wait(5000);
    var addresses = task.Result;

The deadlock remained.

2) Following the implementation in System.Data.SqlClient:

    private static async Task<Socket> ConnectAsync(string serverName, int port)
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
            await socket.ConnectAsync(serverName, port).ConfigureAwait(false);
            return socket;
        }

        // On unix we can't use the instance Socket methods that take multiple endpoints
        IPAddress[] addresses = await Dns.GetHostAddressesAsync(serverName).ConfigureAwait(false);
        return await ConnectAsync(addresses, port).ConfigureAwait(false);
    }

(Note: on Windows, SqlClient does not call Dns.GetHostAddressesAsync at all.)

Dns.GetHostAddressesAsync was then wrapped in an async/await proxy method:

    private static async Task<IPAddress[]> GetHostAddressesAsyncProxy(string host)
    {
        return await System.Net.Dns.GetHostAddressesAsync(host);
    }

The deadlock remained.
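Both attempts so far still end with a thread blocked in the constructor, waiting on the task. Where the calling code can be restructured (which the post says was not possible for this particular constructor), a common alternative is an async factory method, so that nothing ever blocks on the lookup. This is a sketch only; `RemoteEndpoint` and `CreateAsync` are hypothetical names standing in for the real class.

```csharp
using System.Net;
using System.Threading.Tasks;

// Hypothetical type standing in for the class whose constructor resolved the host.
public sealed class RemoteEndpoint
{
    public IPAddress[] Addresses { get; }

    // The constructor only stores data; it performs no I/O and never blocks.
    private RemoteEndpoint(IPAddress[] addresses)
    {
        Addresses = addresses;
    }

    // Callers await this factory instead of calling "new".
    public static async Task<RemoteEndpoint> CreateAsync(string host)
    {
        IPAddress[] addresses =
            await Dns.GetHostAddressesAsync(host).ConfigureAwait(false);
        return new RemoteEndpoint(addresses);
    }
}
```

Usage would be `var endpoint = await RemoteEndpoint.CreateAsync(host);` from an async method. The private constructor makes it impossible to create an instance without going through the awaited lookup.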

3) Modify the source code of System.Net.Dns, changing the asynchronous method

    public static Task<IPAddress[]> GetHostAddressesAsync(string hostNameOrAddress)
    {
        NameResolutionPal.EnsureSocketsAreInitialized();
        return Task<IPAddress[]>.Factory.FromAsync(
            (arg, requestCallback, stateObject) => BeginGetHostAddresses(arg, requestCallback, stateObject),
            asyncResult => EndGetHostAddresses(asyncResult),
            hostNameOrAddress,
            null);
    }

into one that does the work synchronously:

    public static Task<IPAddress[]> GetHostAddressesAsync(string hostNameOrAddress)
    {
        NameResolutionPal.EnsureSocketsAreInitialized();
        // Resolve on the calling thread and hand back an already-completed task.
        return Task.FromResult<IPAddress[]>(GetHostEntry(hostNameOrAddress).AddressList);
    }

Problem solved!

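This shows that the deadlock really was caused by calling an asynchronous method synchronously from a constructor. At the time of writing, System.Net.NameResolution offers only asynchronous APIs for host name resolution. The GetHostEntry() used above is a synchronous method, but it is available only for netstandard 2.0, whereas the System.Net.NameResolution package currently on nuget.org supports only up to netstandard 1.3.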
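On a target framework that does expose the synchronous API (netstandard 2.0, or .NET Core 2.0 and later), application code can get the same effect as the patch without touching corefx. A minimal sketch, assuming such a target; `ResolvedHost` is a hypothetical class name used only for illustration.

```csharp
using System.Net;

// Hypothetical type standing in for the class whose constructor resolves the host.
public sealed class ResolvedHost
{
    public IPAddress[] Addresses { get; }

    public ResolvedHost(string host)
    {
        // Synchronous lookup: there is no Task to wait on,
        // so the sync-over-async pattern from the original code disappears.
        Addresses = Dns.GetHostAddresses(host);
    }
}
```

Unlike the original `task.Wait(5000)`, this call has no timeout of its own, so a slow DNS server holds the constructor for as long as the lookup takes.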
