Strange 'Read timed out'

Postby alitrix » Thu Jun 21, 2007, 7:30 am

Can somebody please explain what's going on here?

ERROR 20 Jun 2007 23:58:36 [Listening for slave connections - org.drftpd.master.SlaveManager@1decdec] org.drftpd.master.SlaveManager -
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:722)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1029)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1056)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1040)
at org.drftpd.master.SlaveManager.run(SlaveManager.java:455)
at java.lang.Thread.run(Thread.java:619)


At first we thought something was wrong with SSL between master and slave,
so we turned off SSL between them, but we still have the same problem.

Greetz
alitrix
 
Posts: 6
Joined: Fri Aug 26, 2005, 6:04 am

Postby zubov » Mon Jun 25, 2007, 12:41 pm

Your socket is timing out. If it happens intermittently during communication between the master and slave, it's almost certainly a network error.

If it doesn't work at all, it's a configuration/network error.

Describe in more detail how and when it occurs.
zubov
Node's little helper
 
Posts: 1172
Joined: Sat Nov 20, 2004, 7:31 pm
Location: USA

Postby danny » Thu Jul 12, 2007, 9:45 am

Can this timeout be raised? I have the same problem with just one slave: sometimes its connection has small spikes and then the master throws it off. I think with a timeout of 30 or 60 seconds it won't happen.
I have tried "site slave slavename set timeout", but it does not appear to have any effect.

Sorry for my bad English :)
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

Postby danny » Thu Jul 12, 2007, 3:50 pm

I think I found it. For the slaves it's in Slave.java, and for the master it's in SlaveManager.java.

Slave.java
Code:
   private static final int socketTimeout = 60000; // 60 seconds, for Socket
   protected static final int actualTimeout = 120000; // 2 minutes, evaluated on a SocketTimeout

SlaveManager.java
Code:
   private static final int socketTimeout = 60000; // 60 seconds, for Socket
   protected static final int actualTimeout = 120000; // 2 minutes, evaluated on a SocketTimeout

Hope this helps you, alitrix.
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

Postby danny » Fri Jul 13, 2007, 1:22 am

This should be the default, or at least a 30-second socket timeout. It runs perfectly now: slaves only get kicked off with a connection reset when they are totally dead, not just for spiking a bit.

Keep up the good work, I love drftpd :)
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

Postby zubov » Fri Jul 13, 2007, 3:59 pm

So you're saying that lowering the timeout values helped you? They default to 60 seconds according to the code snippet you posted.
zubov
Node's little helper
 
Posts: 1172
Joined: Sat Nov 20, 2004, 7:31 pm
Location: USA

Postby danny » Fri Jul 13, 2007, 7:52 pm

Default
Code:
   private static final int socketTimeout = 10000; // 10 seconds, for Socket
   protected static final int actualTimeout = 60000; // one minute, evaluated on a SocketTimeout

Changed
Code:
   private static final int socketTimeout = 60000; // 60 seconds, for Socket
   protected static final int actualTimeout = 120000; // 2 minutes, evaluated on a SocketTimeout
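
The comments on the two constants ("for Socket" and "evaluated on a SocketTimeout") suggest a two-tier check: a short per-read timeout, plus a longer limit that is only consulted when a read times out. The sketch below illustrates that reading with the default values quoted above. It is not DrFTPD's actual code; the class and method names are hypothetical, and time is passed in explicitly so the logic can be followed without real sockets.

```java
// Hypothetical illustration only; none of these names come from DrFTPD.
class TwoTierTimeout {
    // Default values quoted in this thread.
    static final int SOCKET_TIMEOUT_MILLIS = 10000; // per-read wait
    static final int ACTUAL_TIMEOUT_MILLIS = 60000; // total silence allowed

    private final long actualTimeoutMillis;
    private long lastActivityMillis;

    TwoTierTimeout(long actualTimeoutMillis, long startMillis) {
        this.actualTimeoutMillis = actualTimeoutMillis;
        this.lastActivityMillis = startMillis;
    }

    /** Call whenever any data arrives from the peer. */
    void onDataReceived(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    /** Call when a read throws SocketTimeoutException; true means give up on the peer. */
    boolean shouldDropOnReadTimeout(long nowMillis) {
        return nowMillis - lastActivityMillis >= actualTimeoutMillis;
    }

    public static void main(String[] args) {
        TwoTierTimeout tracker = new TwoTierTimeout(ACTUAL_TIMEOUT_MILLIS, 0);
        // Simulate a silent peer: one read timeout every SOCKET_TIMEOUT_MILLIS.
        for (long now = SOCKET_TIMEOUT_MILLIS; now <= ACTUAL_TIMEOUT_MILLIS; now += SOCKET_TIMEOUT_MILLIS) {
            System.out.println(now + " ms: drop=" + tracker.shouldDropOnReadTimeout(now));
        }
    }
}
```

Under this reading, a single read timeout alone would not drop the peer; only a run of read timeouts adding up to actualTimeout of silence would.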
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

Postby zubov » Fri Jul 13, 2007, 8:33 pm

The socketTimeout value there isn't the same as the "drop connection" value. I don't believe it has any bearing on the actual TCP settings at the lower level. That setting determines the maximum length of time that a read operation on the socket will wait before returning. You're probably delaying some threads that weren't meant to be delayed 60 seconds...
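
In plain Java terms, this is how a socket read timeout (SO_TIMEOUT, set with Socket.setSoTimeout) behaves: a blocked read gives up with SocketTimeoutException after the timeout, but the connection itself stays open. The small self-contained demo below shows this on the loopback interface. The class and method names are made up for illustration and it assumes nothing about DrFTPD's own code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of DrFTPD.
class ReadTimeoutDemo {
    /**
     * Connects to a local server that never sends anything, then attempts one
     * read with the given SO_TIMEOUT. Returns true if the read timed out and
     * the socket was still open and connected afterwards.
     */
    static boolean readTimesOutButSocketStaysOpen(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis); // limits how long read() may block
            try {
                client.getInputStream().read(); // the peer is silent, so this blocks
                return false;                   // data or EOF arrived unexpectedly
            } catch (SocketTimeoutException e) {
                // The read gave up, but the connection was not dropped.
                return !client.isClosed() && client.isConnected();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out, socket still open: " + readTimesOutButSocketStaysOpen(200));
    }
}
```

After a SocketTimeoutException the caller is free to retry the read or to close the socket, so whether a peer actually gets dropped is a decision made by the application code handling the exception.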
zubov
Node's little helper
 
Posts: 1172
Joined: Sat Nov 20, 2004, 7:31 pm
Location: USA

Postby danny » Fri Jul 13, 2007, 9:24 pm

As far as I can see, actualTimeout just defines the default value for "site slave set timeout".
And socketTimeout defines how long the slave may take to respond when the master checks its status.

Before, when my slave spiked and the master checked it, it timed out and was set offline. Now, with the higher timeout value, it has some more time to answer back.

The thread delays you mentioned are fixed by changing the timeout for all your slaves in slave.conf. If you set it to the same value as actualTimeout, you don't need to use site slave set timeout.

socketTimeout won't delay anything unless the slave times out or spikes; then clients will be stuck on the PASV command until the slave stops timing out.

REMEMBER, this is VERY rare. Out of 18 slaves I have only 2 with timeouts big enough to make this necessary, and they are on the same ISP.
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

Postby zubov » Sun Jul 15, 2007, 12:57 pm

I don't think you quite understand the algorithm by which the slaves and master communicate. What you're saying isn't exactly true, but whatever works. (Same ISP... so it isn't a network error? ;))
zubov
Node's little helper
 
Posts: 1172
Joined: Sat Nov 20, 2004, 7:31 pm
Location: USA

Postby danny » Sun Jul 15, 2007, 6:02 pm

Hehe, I don't really care, it works. And yes, I know it's a network problem; what I've done is just make drftpd less sensitive to it, so those slaves won't get kicked off 20+ times a day.

And I suspect alitrix has the same problem.

But from time to time all lines, even the best ISPs', have those small spikes, timeouts, network errors, call them what you want. That's why I said the default should be a little more than 10 seconds. I've had a MUCH more stable site since I changed it: slaves no longer get disconnected unless they are really down. And no bugs.
danny
Node's little helper
 
Posts: 117
Joined: Tue Jan 03, 2006, 11:50 pm

