
Solve my server's performance issue

Evolunia

Hello, I am looking for someone who can solve my server's performance issues.

Some info about the server:
The server I have worked on is mostly a project for me to learn Lua and C++, but I have still been very cautious about adding loops and addEvent calls that could ruin my performance. I have looked through all of those now too, and they all seem to be safe. My server contains a lot of custom stuff added in Lua and in C++.

Some info about the performance issue:
It happens after the server has been online for a while. For example, once it happened after maybe 12h of uptime, and once after 5h of uptime.

When the issue happens, this is what happens to my CPU usage (after around 2h of this happening, CPU usage has slowly gone from 30-40% to 150+%):
LldBKD3.png

And here's how the perf output looked both times the chart went up massively:

JjGM99g.png


and

NTFEAef.png



Also, my server was online around 8 months ago, and I think I had the same issue then, but it disappeared when I deleted an onThink script (I don't have any of those anymore, and I am not sure if that was the same kind of lag). After that the issue never happened again. But since then I have added a lot of C++ adjustments and new Lua scripts.

My server owner can pay $50 to anyone who can help me fix this issue (maybe it's a small price for this fix, but our server doesn't run donations).

Oh, and the server is based on TFS 1.2, taken from GitHub around 1 year ago.
 
0. Does it lag the whole server? Can new players connect to the server? Can connected players move in game?
1. How many players are online?
2. Can you run a copy of the server that no one can connect to [no players] and check if it goes high-CPU in the same time? It could be some globalevent or a hardware problem.
3. 200% CPU looks weird, as TFS has one thread that 'works' and the rest should use around 0% CPU. Maybe the second thread is the CPU profiling tool. Did it use 100% or 200% of CPU before you turned on the CPU profiler?
4. When it goes high, use the 'htop' command to view how many processes TFS has and which of them use CPU (it shows the process and all its threads in tree view).
We can identify them by their PIDs and check which use so much CPU (the lowest PID is the main 'otserv', then the database, network and game threads start in some order).
5. Can you post a screenshot of that 'perf' tool when the server works normally, so we can try to compare what's going on?
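
For anyone who wants to log per-thread CPU instead of watching htop, here is a minimal sketch of the same idea. It assumes Linux, where each thread has a stat line under /proc/<pid>/task/<tid>/stat with utime and stime as fields 14 and 15. The names ThreadCpuTicks, parseStatLine and cpuPercent are made up for this example and are not part of TFS.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// CPU time counters taken from one line of /proc/<pid>/task/<tid>/stat.
struct ThreadCpuTicks {
    bool valid = false;
    uint64_t userTicks = 0;   // field 14 (utime), in clock ticks
    uint64_t systemTicks = 0; // field 15 (stime), in clock ticks
};

// Parses utime and stime from a stat line. The thread name (field 2) is
// wrapped in parentheses and may contain spaces, so parsing starts after
// the LAST ')' in the line.
ThreadCpuTicks parseStatLine(const std::string& line)
{
    ThreadCpuTicks result;
    const std::size_t close = line.rfind(')');
    if (close == std::string::npos) {
        return result;
    }

    std::istringstream rest(line.substr(close + 1));
    std::string token;
    // Skip fields 3..13 (state, ppid, ..., cmajflt): 11 tokens.
    for (int i = 0; i < 11; ++i) {
        if (!(rest >> token)) {
            return result;
        }
    }
    if (rest >> result.userTicks >> result.systemTicks) {
        result.valid = true;
    }
    return result;
}

// Converts a tick delta between two samples into percent of one core.
double cpuPercent(uint64_t deltaTicks, long ticksPerSecond, double intervalSeconds)
{
    assert(ticksPerSecond > 0 && intervalSeconds > 0.0);
    return 100.0 * static_cast<double>(deltaTicks) /
           (static_cast<double>(ticksPerSecond) * intervalSeconds);
}
```

To use it, sample each thread's stat line twice a few seconds apart, subtract the tick sums, and pass the delta to cpuPercent together with the tick rate (on Linux, sysconf(_SC_CLK_TCK)). A thread that stays near 100% is the one to look at first.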
 
0. Does it lag the whole server? Can new players connect to the server? Can connected players move in game?

Players can connect and play, but it's very laggy, and monsters are laggy too.

1. How many players are online?

When this happened there were around 80 players online, but it also happened at around 60 players.

2. Can you run a copy of the server that no one can connect to [no players] and check if it goes high-CPU in the same time? It could be some globalevent or a hardware problem.

I have had the same hardware configuration and the exact same server scripts online with zero players for a couple of weeks before, and then there was no such CPU issue. But I think I will get a new server, put all the stuff there, and see if it happens with 0 players again.

3. 200% CPU looks weird, as TFS has one thread that 'works' and the rest should use around 0% CPU. Maybe the second thread is the CPU profiling tool. Did it use 100% or 200% of CPU before you turned on the CPU profiler?

Hmm, I think I was running top the whole time when CPU usage was at 200%, but nothing else, and I think top shouldn't matter? And I only used the CPU profiler for some minutes, so I don't think those spikes are in the charts.

4. When it goes high, use the 'htop' command to view how many processes TFS has and which of them use CPU (it shows the process and all its threads in tree view).
We can identify them by their PIDs and check which use so much CPU (the lowest PID is the main 'otserv', then the database, network and game threads start in some order).

If it happens again I will check with htop.

5. Can you post a screenshot of that 'perf' tool when the server works normally, so we can try to compare what's going on?


vK0AyT6.png


That's how it looks now, 76 players online and 30% CPU usage.
 
TzRDe5R.png


Image of how htop looks with high CPU usage.

I also compiled a folder with C++ and Lua scripts that might be causing the lag, but I kinda don't want to share it with everyone. So if someone is interested in looking through those and is very trusted, I can share it with them, and I will obviously pay if it fixes my issue!
 
Post an image of htop with low CPU, because this looks weird.

When I see all OTS threads go high CPU, I start 'tcpdump' and look for a network attack.
 
WYrRAMB.png


How it looks now, 60 players online, restarted around 1h ago.

I will try tcpdump if it happens again, but I have no idea how to make sense of its output.
 
If it's a dedicated server or VPS, there should be some network stats in the hosting company's panel. If someone is attacking, there must be a jump in 'packets per second' or 'bytes per second' in the statistics.

EDIT:

'htop' from Kasteria.pl. As you can see, one thread uses a lot of CPU and the second almost nothing.
htop_ots.png


Your OTS 'htop':

htop_ots_probably_attack.png

I don't know what changes you have in your sources, but if the network thread does the same things as in original TFS, it should use a few % CPU, not 61.7.
 
Mmhmm, I don't think there are any changes to the network thread, but there were lots of new functions and some systems added to my sources, and I'm completely new to C++, so the issue is probably somewhere in there ^^

Anyway, I applied some pathfinding changes that were available on the TFS GitHub, and now CPU usage is lower. Since then my CPU usage hasn't gone super high, and I haven't had to restart the server.
On top of this I made a shell script that checks how much CPU was used over the last 5 minutes, and if it's too high it automatically restarts my server. I think it's a good temporary fix, and now I don't have to panic about fixing the issue.

It still seems there's some issue that makes CPU usage go higher over time, but after the changes I made it's not as urgent to fix. I will try disabling various custom things I added on my test server and try to locate it that way.
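
The restart check described above can be sketched as a rolling average over recent CPU samples. This is only an illustration in C++ (the actual script was a shell script and was not posted). The class name CpuWatchdog, the one-sample-per-minute window and the 120% threshold are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Keeps the last `window` CPU samples and reports when their average
// stays above `threshold` percent. Hypothetical helper, not part of TFS.
class CpuWatchdog {
public:
    CpuWatchdog(std::size_t window, double threshold)
        : windowSize(window), thresholdPercent(threshold)
    {
        assert(window > 0);
    }

    // Adds one sample (e.g. one per minute) and drops the oldest one
    // once the window is full.
    void addSample(double cpuPercent)
    {
        samples.push_back(cpuPercent);
        sum += cpuPercent;
        if (samples.size() > windowSize) {
            sum -= samples.front();
            samples.pop_front();
        }
    }

    double average() const
    {
        return samples.empty() ? 0.0 : sum / static_cast<double>(samples.size());
    }

    // Only fires once the window is full, so a short spike right after
    // startup does not trigger a restart.
    bool shouldRestart() const
    {
        return samples.size() == windowSize && average() > thresholdPercent;
    }

private:
    std::size_t windowSize;
    double thresholdPercent;
    std::deque<double> samples;
    double sum = 0.0;
};
```

With one sample per minute, CpuWatchdog(5, 120.0) would cover the 5-minute window mentioned in the post. Averaging over the window rather than reacting to a single reading avoids restarting on brief spikes such as map saves.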
 
Do you have any network usage statistics? To me it really looks like some network attack.

I heard that someone wrote changes to the TFS engine that move pathfinding to another thread. Do you use his engine? I heard that most of the popular BR servers use his engine.
 
The pathfinding changes I made are available in a public pull request on the TFS GitHub.

And I don't think it's an attack, because the CPU usage always eventually increases after my server has been online for some hours. For example, it's now been 2 weeks, and pretty much every second day I had to restart my server because of this issue.

I have tried changing several of my systems during these two weeks since I last posted, and nothing has solved the performance issue so far.
Another interesting thing I've found is that sometimes the CPU usage will jump from around 150% down to 50% without any restart or anything really happening ingame. And when this happens, over time certain tiles become really buggy ingame. For example, when you walk on certain positions, your character becomes completely invisible and even your health and mana bars disappear.
Here's how that looks:
unknown.png


The only way for me to fix those bugged positions is to restart the server. I've tried making scripts that clean those tiles, and even ones that completely remove the ground tile and add a new one, but the position is still bugged when a player walks onto it. The positions/tiles are different every time. It can happen inside spawns, but it mostly occurs in the city, I believe because that's where most players are.

I will try to make a gif of these tiles when it happens again.

So I am thinking that the CPU issue and this issue are related in some way. Maybe it's something to do with using items? Idk, but maybe somebody reads this and can give some more advice xD

But atm my CPU issue is manageable. I just have to restart once a day or so, and players will almost never feel the lag, but it's still something I must solve for the future.

Also, here are some graphs for 1 week. I think only the CPU graph is weird.
0YNCCXt.png
 
Could it be that your server is using classicAttackSpeed?
 
Omg, yes I am :D It's set to true in config.

I also remember applying some of this stuff a really long time ago: "Fix attack scheduler" by ranisalt (Pull Request #1305, otland/forgottenserver).

I also have items that can give attack speed and stuff like this.

Do you know what the issue is, or how to fix it?? It would help me soo much :eek:
And like I said in the first post, I can pay for a fix too.
Okay, and do you have any limit set for how fast a player may attack?

You can also try modifying Player::doAttacking (to something like this):
C++:
void Player::doAttacking(uint32_t)
{
    uint32_t delay = getAttackSpeed();
    // First attack: backdate lastAttack so the delay check below passes.
    if (lastAttack == 0) {
        lastAttack = OTSYS_TIME() - delay - 1;
    }

    if (hasCondition(CONDITION_PACIFIED)) {
        return;
    }

    // Not enough time has passed since the last successful attack.
    if ((OTSYS_TIME() - lastAttack) < delay) {
        return;
    }

    bool result = false;
    Item* tool = getWeapon();
    const Weapon* weapon = g_weapons->getWeapon(tool);
    if (weapon) {
        result = weapon->useWeapon(this, tool, attackedCreature);
    } else {
        result = Weapon::useFist(this, attackedCreature);
    }

    if (result) {
        lastAttack = OTSYS_TIME();
        setNextActionTask(nullptr);
        // Schedule the next attack check, never sooner than SCHEDULER_MINTICKS.
        g_scheduler.addEvent(createSchedulerTask(std::max<uint32_t>(SCHEDULER_MINTICKS, delay), std::bind(&Game::checkCreatureAttack, &g_game, getID())));
    }
}
 
There is currently no limit, but I should add one. I think the fastest players (only a few) have around 450ms, and the average is probably 800ms (which is the default for promoted, with 1000ms for non-promoted).

Maybe it would be a good idea for me to increase the default attack speed to a higher number.

But first! I will apply this doAttacking change and see if it improves things. In some days I will update this thread and tell you how it went :D And if it works, I fucking love u man :D and I will pay u for this fix like promised in the first post.
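
A limit on attack speed like the one discussed above could be sketched as a simple clamp on the delay in milliseconds. The function name limitAttackDelay and the 500ms/2000ms bounds used below are hypothetical values for illustration. Where exactly to apply it (for example on the value returned by getAttackSpeed()) is an assumption that depends on the custom sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamps an attack delay (in milliseconds) to the range
// [minDelayMs, maxDelayMs]. Hypothetical helper, not part of TFS.
uint32_t limitAttackDelay(uint32_t delayMs, uint32_t minDelayMs, uint32_t maxDelayMs)
{
    assert(minDelayMs <= maxDelayMs);
    return std::min(std::max(delayMs, minDelayMs), maxDelayMs);
}
```

With a floor of 500ms, a player whose items bring the delay down to 450ms would be held at 500ms, while the 800ms and 1000ms defaults pass through unchanged. A floor also keeps stacked attack-speed items from pushing the delay toward zero and flooding the scheduler with attack events.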
 