Short instruction how to read logs of Feature - [TFS 1.4] OTS statistics (C++/Lua/SQL) by kondra (https://otland.net/threads/tfs-1-4-ots-statistics-c-lua-sql-by-kondra.283717/)
There are 2 types of logs:
1.
2.
1. CPU usage reports
Example
Nobody is online, server only executes tasks with intervals.
Tasks are sorted by CPU consumption. Only tasks that consumed over 0.1% of relative time are reported.
Let's analyse most CPU consuming task:
First column
Second column
Third column
Fourth column
Fifth column
2. Slow and very slow reports
Information that execution of 1000 player saves took 1000 milliseconds tells that average save takes 1 ms, which gives avg
That's why there are slow action reports, which report single action that executed for over 10/50 milliseconds. 50 ms lags on OTS are clearly visible by anyone online, 10 ms lags are visible as 'not smooth' walking on high levels.
Ex.
Not much to analyse.
What is reported
One important thing: Lua events/C++ functions 'inside' other events/functions are not reported.
Ex. player cast spell UE, it kills player, killed player gets saved in database -> player has big depot, save takes 50 ms.
In
So you got to combine both logs and read between lines to find real lag reason.
There are 2 types of logs:
1.
dispatcher.log
, lua.log
, sql.log
and special.log
, which report total CPU time consumed by given action in last 30 second, grouping all event with same name2.
dispatcher_slow.log
, lua_slow.log
, sql_slow.log
and special_slow.log
(and X_very_slow.log
), which report single action execution time, when it's over X ms (config.lua
, default 10 ms for slow, 50 ms for very slow)1. CPU usage reports
Example
dispatcher.log
content from empty server with 30 second 'dump interval' (statsDumpInterval
in config.lua
):
Code:
[29/01/2023 16:09:07]
Thread: 1 Cpu usage: 0.570128% Idle: 99.6865% Other: -0.256631% Players online: 0
Time (ms) Calls Rel usage % Real usage % Description
154 120 90.27125% 0.51466% std::bind(&Game::checkDecay, this)
16 300 9.41128% 0.05366% std::bind(&Game::checkCreatures, this, (index + 1) % EVENT_CREATURECOUNT)
0 12 0.28179% 0.00161% std::bind(&Game::updateWorldTime, this)
Tasks are sorted by CPU consumption. Only tasks that consumed over 0.1% of relative time are reported.
Let's analyse most CPU consuming task:
Code:
154 120 90.27125% 0.51466% std::bind(&Game::checkDecay, this)
154
is total CPU time used by Game::checkDecay
in last 30 seconds. Time is in milliseconds, so 154
is 0.154
second.Second column
120
is number of calls to given function. It was executed 120 times. It means that 120 Game::checkDecay
calls took 154 ms, which gives 1.28
millisecond per execution (0.00128
second)Third column
90.27125%
is Rel usage
- relative usage. It tells how much CPU time consumes given function compared to total CPU consumed by OTS. Real CPU usage in last 30 seconds was Cpu usage: 0.570128%
, Game::checkDecay
consumed 0.51466%
, which gives 90.27125%
of total CPU consumed by OTS.Fourth column
0.51466%
is Real usage
- real usage compared to single core of your server CPU. Game::checkDecay
executed for 154
ms in last 30 seconds interval, it means it's average CPU usage was 154ms / 30000ms
= 0.51466%
. Google says that 154 / 30000
is 0.5133%
, not 0.51466%
, it's not a bug. OTS stats measure time with nanosecond precision, so value 154
ms is not exactly time measured by OTS stats. For OTS stats it's 154398876 nanoseconds and value presented as percent of CPU uses this high precision value for calculation.Fifth column
std::bind(&Game::checkDecay, this)
is name of function called.2. Slow and very slow reports
Information that execution of 1000 player saves took 1000 milliseconds tells that average save takes 1 ms, which gives avg
0.001
second 'lag' on OTS. For sure something not noticeable by players. It does not show that 999 saves took 100 ms and 1 player save took 900 ms, which lagged whole OTS for 0.9 second.That's why there are slow action reports, which report single action that executed for over 10/50 milliseconds. 50 ms lags on OTS are clearly visible by anyone online, 10 ms lags are visible as 'not smooth' walking on high levels.
Ex.
dispatcher_slow.log
:
Code:
[29/01/2023 16:26:09] Execution time: 32 ms - std::bind(&Game::checkDecay, this) - checkDecay
Game::checkDecay
executed for 32 ms. Probably there are too many items decaying on map ex. fire fields with wrong item ID.What is reported
One important thing: Lua events/C++ functions 'inside' other events/functions are not reported.
Ex. player cast spell UE, it kills player, killed player gets saved in database -> player has big depot, save takes 50 ms.
In
lua.log
it will be reported as data/spells/scripts/attack/ue.lua:onCastSpell
. In dispatcher.log
it will be reported as Game::playerSay
.So you got to combine both logs and read between lines to find real lag reason.
Last edited: