
Not restarting locked hlds


Brandon


We had an hlds process lock up yesterday, and it pegged the entire machine. It took 5 minutes just to get to it in Task Manager and kill it.

Apparently this happened again today, but the customer restarted the process via the control panel.

Any idea why TCAdmin might not have restarted the hlds process?


If the game is still responding to status requests, then the server will not be restarted.

Also, the service is only restarted after a certain amount of time has passed, to avoid false restarts when network issues occur.

I am not sure of the exact timing system that Luis has programmed for restarts. I know it checks a few times before restarting the server. This is to stop a server that is still starting up (something that takes a while, like BF2) from becoming stuck in a restart loop.

However, if the server was pegged for longer than 5 minutes or so, I would check your system logs and game logs, and pass anything you find into a ticket so we can see what happened.

I would also suggest you ask the client to stop whatever they are doing to cause it :smile:
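
To make the "checks a few times before restarting" idea above concrete, here is a minimal sketch of that kind of policy. It is not TCAdmin's actual implementation, whose timing is not described in this thread. The class name `RestartPolicy`, the threshold of 3 failed checks, and the 120-second startup grace period are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    """Decide when a monitored server should be restarted.

    Hypothetical helper: a restart is due only after `max_failures`
    consecutive failed status checks, and never during a startup grace period.
    """
    max_failures: int = 3         # assumed value, not a TCAdmin setting
    startup_grace: float = 120.0  # assumed seconds to ignore failures after a start
    failures: int = 0
    started_at: float = 0.0

    def record_check(self, responded: bool, now: float) -> bool:
        """Record one status check; return True if a restart is due."""
        if responded:
            # A single good reply clears the failure streak.
            self.failures = 0
            return False
        if now - self.started_at < self.startup_grace:
            # Still starting up (e.g. a slow-loading game): do not count it.
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            # Restart is due; reset state so the new process gets a grace period.
            self.failures = 0
            self.started_at = now
            return True
        return False
```

Requiring several consecutive failures filters out a brief network blip, and the grace period keeps a slow-starting server from being restarted in a loop.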


This process was not restarted at all by TCAdmin. It's weird, however, that a single process pegged at 100% on cpu1 of a dual Opteron 285 dual-core machine would affect the entire machine. It happened again this morning, and again just a few minutes ago.

I was unable to find out whose server it was, because the machine restarted several servers once I was able to kill it.

I'm thinking I might need to put serverdoc in there to monitor the hlds processes, because TCAdmin is having trouble detecting this crash/lock-up.


Now, a lock-up and taking 100% CPU are different things. If it's taking 100% CPU and can still be queried, then our software will not restart it, because it thinks the server is still active.

However, if it is truly locked and the server cannot be queried, then we would need to investigate why the service is not being restarted.
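
For anyone who wants to test "can still be queried" by hand, below is a rough sketch of a status probe. It assumes the server answers Valve's A2S_INFO UDP query; whether TCAdmin uses this exact query is not stated in this thread. The function name `responds_to_query` and the 2-second timeout are made up for the example.

```python
import socket

# A2S_INFO request from Valve's server query protocol.
A2S_INFO = b"\xff\xff\xff\xffTSource Engine Query\x00"


def responds_to_query(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a UDP reply to an A2S_INFO request arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(A2S_INFO, (host, port))
            data, _ = sock.recvfrom(4096)
        except OSError:
            # Timeout, ICMP "port unreachable", or another socket error.
            return False
    # A single-packet reply starts with four 0xFF bytes.
    return data.startswith(b"\xff\xff\xff\xff")
```

Running this against the suspect server while it sits at 100% CPU would show which of the two cases above applies: a `True` result means a query-based monitor still sees the server as active.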


The process is locked up, as in locked up and using 100% CPU as a result. The process does not use 100% CPU during regular operation; 250 players don't even use 10% of the entire dual Opteron dual-core machine. This never occurred before TCAdmin, and all the same game servers are still on the machine. That leads me to believe that serverdoc was capable of detecting whatever type of lock-up is occurring and restarting the process.

I've narrowed it down to 5 servers. I'm fairly certain I know which one it is, but I can't be sure until it occurs again.


Obviously the process doesn't normally take 100% CPU. But what you haven't answered is whether the game server using the 100% is still returning a status query.

TCAdmin does not check CPU usage per instance, so if the game returns a query response to the panel, then TCAdmin will not restart it, because it still thinks it's active.

If serverdoc handles this more to your liking, then maybe you should use it?
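
The gap described above (a server that still answers queries but is stuck at 100% CPU) can be illustrated with a small sketch. This is not how TCAdmin or serverdoc works internally; the function `classify_server`, the 95% threshold, and the 5-sample window are assumptions, and the CPU samples are expected to come from whatever monitoring source is available.

```python
from typing import Sequence


def classify_server(query_ok: bool, cpu_samples: Sequence[float],
                    pegged_at: float = 95.0, min_samples: int = 5) -> str:
    """Classify a server from a query result and recent CPU samples (percent)."""
    if min_samples < 1:
        raise ValueError("min_samples must be at least 1")
    recent = list(cpu_samples)[-min_samples:]
    # Pegged means every one of the last `min_samples` readings is at the limit.
    pegged = len(recent) == min_samples and all(s >= pegged_at for s in recent)
    if not query_ok:
        return "unresponsive"           # a query-only monitor would catch this
    if pegged:
        return "pegged-but-responsive"  # a query-only monitor would miss this
    return "healthy"
```

For example, `classify_server(True, [100, 99, 100, 98, 100])` returns `"pegged-but-responsive"`, which is the case a query-only check treats as active.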

 

 


I apologize if I have insulted you.

Once I know which server is causing the problem, I will find out more information. I was unable to this time because I could hardly even move around in RDP. Since I've now narrowed it down to 5 servers running on that CPU, I'll find out more the next time it occurs.

My comparison to serverdoc is because we are in the middle of switching from that software to TCAdmin, and our hlds server machines are currently on hold until there is some sort of affinity fix. It just seems to me that if a $60/year application can detect these frozen processes, then a $16/month application should be able to as well.

Please remember that I am not the only person running game servers with this software. Again, I am sorry if I insulted your product.

Just my 2 cents.


It's not insulting to get information about things that are wrong or not working. That is the point of these forums and the foundation of our company. We pride ourselves on being honest and not hiding anything. If something doesn't work properly, we want to correct it, not ignore it.

I guess, in my own way, I am asking for your help in determining the problem so it can be corrected. We need to know the circumstances that are causing it, and why the service isn't restarting.

If my response came across as angry or out there, it was not intended that way. My thinking was that if you put serverdoc in place and everything ran normally, we would not get the info back from you to correct the problem.

Archived

This topic is now archived and is closed to further replies.
