
Theorizing a Modern Engine Design

I understand why you would want to rewrite the entire thing but it's not feasible.

We have to work with what we have, read TFS and OTC.
 
Honestly, we should just stick to C++ and rewrite the entire system in a modern, organized manner. TFS is currently a hilarious mess, and as a developer myself I find it quite annoying to work on, even more so after I've looked into CipSoft's reverse-engineered code, which is obviously better organized.

I just took a look at Rust and damn, it might be fast and safer, but I find it even more complicated than C++ itself.
Let's stay away from Python; I personally don't like it, and it's also slow.
 
I understand why you would want to rewrite the entire thing but it's not feasible.

We have to work with what we have, read TFS and OTC.
You obviously missed the point of this thread; I have no intention of rewriting anything. The title of the thread is “Theorizing a Modern Engine Design” and the first post gives some ideas and asks others for theirs IF they were to build something from scratch.

But ignoring that, why is it not feasible and why do we have to work with what we have? Not to be rude, but that’s very short-sighted thinking. The company I work for was in a similar situation: monolithic codebase, time-consuming to implement new features, difficult to update existing ones. The dev team proposed a new micro-service architecture that would be extremely modular, easier/quicker to update and add new features to, and easier to maintain, while running more efficiently (faster) than our current system, but we would need to write it from scratch and it would take time. Luckily, the owner is a level-headed person and understood the benefits. However, if we had been told “it’s not feasible” and “to work with what we have”, I’m not sure a lot of the team would have stuck around.
Honestly, we should just stick to C++ and rewrite the entire system in a modern, organized manner. TFS is currently a hilarious mess, and as a developer myself I find it quite annoying to work on, even more so after I've looked into CipSoft's reverse-engineered code, which is obviously better organized.

I just took a look at Rust and damn, it might be fast and safer, but I find it even more complicated than C++ itself.
Let's stay away from Python; I personally don't like it, and it's also slow.
Yeah, Rust has a very steep learning curve, but, as mainly a C++ dev myself, I picked it up fairly quickly once I actually sat down and started writing code. Looking at Rust code without knowing the syntax is daunting, but once you know it, it’s actually easier to read than something like JavaScript (imo).
 
A company is making money, unlike in OSS where there are a few guys with limited resources working for free. Rewrite it in Rust™ if you must, but we'll just end up with TFS and a prototype in Rust. We're 80% there with TFS: it compiles, it works, it's just bloated and annoying to work on.
 
However, if we had been told “it’s not feasible” and “to work with what we have”, I’m not sure a lot of the team would have stuck around.
Absolutely spot on with that. There comes a time when it's better to nuke and pave, because repairing the cruft of yesterday is actually a less efficient path. And one can't really gauge whether that critical mass has been reached without a thread like this.

I know one thing that should be an imperative in a from-scratch rewrite: the web portal, if it interfaces with the database at all, should do so from a read-only user.
The server should have:
Traditional RESTful API providing JSON structures, so the webportal can ask the server to make changes, and never ever ever even think about touching the database itself.
Webhook support for pushing updates to a new breed of serverlist site.

GraphQL would be lovely, but the C++ support just isn't there yet. This is one area where Rust would provide an advantage, via Juniper.
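To make the webhook idea a bit more concrete, here's a rough sketch (in Rust, since that's the direction the thread is leaning) of what a server-status payload pushed to a serverlist site might look like. All field names here are hypothetical, and a real implementation would use a JSON library such as serde_json rather than hand-rolled formatting:

```rust
// Hypothetical payload a server could push to a subscribed serverlist
// webhook. Field names are invented for illustration only.
struct ServerStatus {
    name: String,
    players_online: u32,
    uptime_seconds: u64,
}

impl ServerStatus {
    // Hand-rolled JSON to keep the sketch dependency-free; a real
    // implementation would use a library such as serde_json instead.
    fn to_json(&self) -> String {
        format!(
            "{{\"name\":\"{}\",\"players_online\":{},\"uptime_seconds\":{}}}",
            self.name, self.players_online, self.uptime_seconds
        )
    }
}

fn main() {
    let status = ServerStatus {
        name: "OT".to_string(),
        players_online: 42,
        uptime_seconds: 3600,
    };
    // An HTTP POST of this body to each subscribed webhook URL would
    // replace the old "serverlist polls every server" model.
    println!("{}", status.to_json());
}
```

A serverlist site would register a URL once and then receive a POST with this body whenever the status changes, instead of polling.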
 
Absolutely spot on with that. There comes a time when it's better to nuke and pave, because repairing the cruft of yesterday is actually a less efficient path. And one can't really gauge whether that critical mass has been reached without a thread like this.

I know one thing that should be an imperative in a from-scratch rewrite: the web portal, if it interfaces with the database at all, should do so from a read-only user.
The server should have:
Traditional RESTful API providing JSON structures, so the webportal can ask the server to make changes, and never ever ever even think about touching the database itself.
Webhook support for pushing updates to a new breed of serverlist site.

GraphQL would be lovely, but the C++ support just isn't there yet. This is one area where Rust would provide an advantage, via Juniper.
I agree with you. The engine should have some kind of REST API so others can remotely query and alter the engine state (like a remote panel).

I also think the engine should not have any code related to SQL or whatever storage backend people may choose. It should access the database through a REST API in a different service, in order to gain a few great features:

  1. Async requests: better performance. The engine won't waste time waiting for an answer from the database.
  2. The database can be hosted on a different machine/host. Sync queries demand that the database sit on the same machine as the engine.
  3. A clear and simple interface. For the user (the engine or the website), the request can be very simple, while the backend may need to retrieve data from different tables, databases, another service, etc.
  4. Requests could perhaps be sent in bulk.
  5. Clear separation between the model (database, SQL or NoSQL) and the engine.
  6. Interactions between the website and the engine could be resolved here, like the website and the engine both changing a player's balance.

It's similar to what CipSoft did with the querymanager. A querymanager is also necessary if you want to have one account with characters on different servers hosted very far from each other (like a host in Brazil and a host in France).
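A minimal sketch of what "no SQL in the engine" could look like: the engine codes against a storage trait, and the querymanager (a REST service, MySQL, or an in-memory store for tests) is just one implementation behind it. Every name here is hypothetical, not an actual TFS or CipSoft API:

```rust
use std::collections::HashMap;

// The engine only ever sees this trait; concrete backends (a REST
// query manager, MySQL, an in-memory store) live behind it.
trait QueryManager {
    fn load_balance(&self, player_id: u32) -> Option<u64>;
    fn store_balance(&mut self, player_id: u32, balance: u64);
}

// In-memory backend: the simplest possible implementation, also
// handy for testing engine logic without any database at all.
struct InMemoryBackend {
    balances: HashMap<u32, u64>,
}

impl QueryManager for InMemoryBackend {
    fn load_balance(&self, player_id: u32) -> Option<u64> {
        self.balances.get(&player_id).copied()
    }
    fn store_balance(&mut self, player_id: u32, balance: u64) {
        self.balances.insert(player_id, balance);
    }
}

// Engine code is written against the trait, never against SQL.
fn deposit(store: &mut dyn QueryManager, player_id: u32, amount: u64) {
    let current = store.load_balance(player_id).unwrap_or(0);
    store.store_balance(player_id, current + amount);
}

fn main() {
    let mut backend = InMemoryBackend { balances: HashMap::new() };
    deposit(&mut backend, 1, 500);
    deposit(&mut backend, 1, 250);
    println!("balance = {:?}", backend.load_balance(1));
}
```

Swapping MySQL for NoSQL, or moving the backend to another machine, then only means writing another implementation of the trait.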
 
Absolutely spot on with that. There comes a time when it's better to nuke and pave, because repairing the cruft of yesterday is actually a less efficient path. And one can't really gauge whether that critical mass has been reached without a thread like this.

I know one thing that should be an imperative in a from-scratch rewrite: the web portal, if it interfaces with the database at all, should do so from a read-only user.
The server should have:
Traditional RESTful API providing JSON structures, so the webportal can ask the server to make changes, and never ever ever even think about touching the database itself.
Webhook support for pushing updates to a new breed of serverlist site.

GraphQL would be lovely, but the C++ support just isn't there yet. This is one area where Rust would provide an advantage, via Juniper.
[attached image: XAzW7EN.png]

-------------------------
My eyes :C
 
I’ve read through the thread. Seems like Rust has a popular presence. Know very little about it myself. It does seem quite daunting at first glance. Truth be told, most things are in the coding world. If someone were to start a project to reinvent TFS using Rust, I’d support the effort.

The documents I’ve read and videos I’ve watched to learn a bit about it seem to suggest it’s extremely efficient and would be ideal for what we hope to accomplish.
 
Any programmers out there willing to code a login server sample in Rust? Maybe then it could draw some programmers' attention toward learning it, if we're leaning towards Rust.

[attached image: XAzW7EN.png]

-------------------------
My eyes :C

Use Dark Theme on your otland configs haha.
 
Yes, fix the "objectively wrong theme selected" bug, and the blinding light will take care of itself.

Clear separation between the model (database, SQL or NoSQL) and the engine.
Interactions between the website and the engine could be resolved here, like the website and the engine both changing a player's balance.

Decoupling the database driver into a microservice is definitely achievable, but I don't think it really buys as much as you believe. It's a headless horseman.

Clean separation of model and presentation? That is a fever dream. Domain Driven Design rarely works in games, and in the cases that it does, it's only by the sheer willpower of a lead engineer. It would never work in a FOSS project that relies on hobby time. Real-time games are basically the very manifestation of boundary leakage. Decoupling the persistence of state is one thing, but divorcing it is not possible. At least not in any fashion that will give the players good UX.

I don't think we really can look to Cip's infrastructure as a role model here. By the time you need topology-aware geo-partitioned multi-region database clusters with observer replicas to alleviate initial fail-over latency, you are so far beyond the capabilities of an informal collective of hobby MMO engine developers it's comical.

I think this REST microservice idea is beginning to grow on me... if only because it makes what I think is a far more important goal less problematic: Bound variable prepared statements. There is a mountain of execution plan performance just sitting there on the table for the taking.
 
Any programmers out there willing to code a login server sample in Rust? Maybe then it could draw some programmers' attention toward learning it, if we're leaning towards Rust.



Use Dark Theme on your otland configs haha.
The following is roughly 20 lines (ignoring the comments) of extremely simple code. You simply have to add tokio as a dependency to your Cargo.toml file to build it; e.g., tokio = { version = "0.2.21", features = ["full"] }. Obviously, you'd want a more robust system; a "NetworkMessage" module for constructing packets, another one for interfacing with a database, etc.

Note that I tested this with a 7.6 client. Also, there isn't a Rust syntax highlight so I used JavaScript.
JavaScript:
// Using these async traits allows us to read/write from/to sockets concurrently.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

// Our process function is marked as async so that we can concurrently read/write the socket.
// We mark the socket parameter as mutable so that we can read/write from/to it.
// We return a Result, so the calling function expects either an Ok or an Err.
async fn process(socket: &mut TcpStream) -> std::io::Result<()> {
    // The following code reads an incoming packet into a local buffer.
    // Create a mutable array that can hold two bytes for the packet size. By default, variables in Rust are immutable.
    // You have to explicitly declare a variable to be mutable in order to modify it. This is one of Rust's many safety features.
    // For those that are unaware, the first two bytes of every packet is the length of that whole packet (not including those two bytes).
    let mut size = [0u8; 2];
    // We can await this async call so that every connection can be processed concurrently. What that means is, reading data from a
    // network socket can be blocking because it has to wait for that data to be available. We're calling `read_exact` which won't
    // return until it has filled the buffer (in this case it will wait for 2 bytes to be available before returning).
    // I can expand on this topic for anyone curious.
    socket.read_exact(&mut size).await?;
    // Tibia uses little-endian bitness for their packet structures. That means bytes are arranged from lowest to highest.
    // For example, let's say there's a 16-bit section of the packet that has a value of 10 (0x0A). It would look like 0x0A 0x00 in the packet,
    // not 0x00 0x0A. So, we can simply use the `from_le_bytes` method to convert our two-byte-array into a 16-bit integer.
    let size = u16::from_le_bytes(size);
    // Now that we know how many bytes are left in the packet, we can create a buffer to hold it.
    let mut buffer = Vec::<u8>::with_capacity(size.into());
    // Even though we've created a buffer that has allocated the space to hold the data, the length is technically zero.
    // So we have to fill it so that our call to `read_exact` knows how many bytes to read.
    buffer.resize(size.into(), 0);
    // Again, we await for asynchronous magic.
    socket.read_exact(&mut buffer).await?;
    // Simply print out the size of the packet and the data.
    println!("{}:{:?}", size, buffer);

    // The following code sends a manually constructed character list packet back to the client and closes the connection.
    // The length of our packet (not including the two bytes for the size) is 20 (0x14) bytes. We simply denote that in little-endian form (0x14 0x00).
    // 0x64 tells the client that the preceding data is in the structure of a character list packet.
    // 0x01 is the number of characters the client should expect to read.
    // 0x04 0x00 0x54 0x65 0x73 0x74 is the character name. Like the packet size, Tibia uses a 16-bit integer (0x04 0x00) to denote the size of a string.
    // Which is then followed by each character represented as a byte. In this case, our character's name is Test.
    // 0x02 0x00 0x4F 0x54 is the world name. Our world name is simply OT.
    // 0x7F 0x00 0x00 0x01 is the IP address converted from dot-notation (127.0.0.1) to four bytes representing each value.
    // 0x04 0x1C is the port (7172).
    // Then, finally, 0x01 0x00 indicates the number of premium days left on the account.
    let character_list = [0x14, 0x00, 0x64, 0x01, 0x04, 0x00, 0x54, 0x65, 0x73, 0x74, 0x02, 0x00, 0x4F, 0x54, 0x7F, 0x00, 0x00, 0x01, 0x04, 0x1C, 0x01, 0x00];
    println!("Sending character list");
    // Similar to `read_exact`, `write_all` won't return until every byte in the buffer has been written to the socket. So we await for concurrency.
    // The lack of a semicolon here allows us to propagate the return of `write_all` back to the caller. The same thing is happening with the use of ?.
    // ? tells the code to continue if the called function returns Ok, or to return early if it's an Err.
    socket.write_all(&character_list).await
}

// This tells the compiler that this is the main entrypoint for the tokio runtime.
#[tokio::main]
// We mark this as async so we can await incoming connections.
async fn main() -> std::io::Result<()> {
    println!("Starting login server...");
    // For simplicity's sake, we'll just bind a TCP listener to our localhost and the standard login port that Tibia/OT uses.
    // That means, to test this, you can only connect to this server with a client on your machine through port 7171.
    let mut listener = TcpListener::bind("127.0.0.1:7171").await?;

    println!("Login server ready");
    // Start an endless loop to continuously accept and process incoming connections.
    loop {
        // Await accepting an incoming connection so as not to block, and allow other connections to be accepted and processed concurrently.
        let (mut socket, addr) = listener.accept().await?;

        // Spawn an asynchronous task to process incoming connections. Tasks allow processing to happen asynchronously/concurrently without the need for separate threads.
        // We move our socket and addr variables to the task
        tokio::spawn(async move {
            println!("Processing client connection for {:?}", addr);
            // By passing the socket as a reference, we're able to continue using it once `process` has returned.
            // If `process` didn't accept a value by reference, we would have had to move the socket and lose ownership of it.
            process(&mut socket).await?;
            // Once the data from the client has been processed there's no need to keep the connection alive.
            socket.shutdown(std::net::Shutdown::Both)
        });
    }
}
Output:
Code:
Starting login server...
Login server ready
Processing client connection for V4(127.0.0.1:50674)
24:[1, 2, 0, 248, 2, 51, 90, 157, 67, 190, 82, 152, 67, 117, 129, 230, 66, 1, 0, 0, 0, 1, 0, 49]
Sending character list
[attached image: character-list.jpg]
 
Decoupling the database driver into a microservice is definitely achievable, but I don't think it really buys as much as you believe. It's a headless horseman.
You mean a mula sem cabeça (a headless mule)?
Clean separation of model and presentation? That is a fever dream. Domain Driven Design rarely works in games, and in the cases that it does, it's only by the sheer willpower of a lead engineer. It would never work in a FOSS project that relies on hobby time. Real-time games are basically the very manifestation of boundary leakage. Decoupling the persistence of state is one thing, but divorcing it is not possible. At least not in any fashion that will give the players good UX.
I don't understand anything you said here, but I think my aim is to decouple persistence and build a clean interface that can be used by both the game engine and the website code. I just don't want the engine touching the SQL/persistence code, and I want it to be async.
Another major issue is how the website's commands interact with the engine; you just can't be sure they won't overwrite the same database data, creating a bug/misbehavior. I hate that the character needs to be offline so the website can do many things, and I hate how hacky the communication between engine and website is.
I don't think we really can look to Cip's infrastructure as a role model here. By the time you need topology-aware geo-partitioned multi-region database clusters with observer replicas to alleviate initial fail-over latency, you are so far beyond the capabilities of an informal collective of hobby MMO engine developers it's comical.
I don't think you understood what I said before. A centralized service handles account data, so you can have the same account for characters on different hosts/instances/game engines. The world/instance data (including characters) would be near or together with the instance/host/game engine.
I think this REST microservice idea is beginning to grow on me... if only because it makes what I think is a far more important goal less problematic: Bound variable prepared statements. There is a mountain of execution plan performance just sitting there on the table for the taking.
Elaborate more. Like building the whole data for a request, such as saving a character, and then sending it as a whole in one REST API call?
 
You mean a mula sem cabeça?
I mean, the headless horseman has his head detached. But he still has to carry it everywhere. So what has it gained him?

I don't understand anything you said here, but i think my aim is to decouple persistence and build a clean interface that can be used for both the game engine and website code. I just don't want the engine touching the sql/persistence code and i want it to be async.
Another major issue is how the website commands interacts with the engine, you just can't be sure that they will overwrite the same database data thus creating a bug/miss behavior. I hate the fact that the character needs to be offiline so the website can do many things and i hate how hacky is the communication between engine and website.
Yes, a single dedicated caller to the database makes sense, and prevents breakage of the "single source of truth" principle, which is what this amounts to. Game active state in the engine's allocated memory becomes different from persisted state in the database altered by the web portal. Preventing this is a 100% good idea.

What I'm saying is that also separating that from the engine makes a new layer analogous to the headless horseman's head: the separation of active state and persisted state will never be clean here, because this is a real-time game, not a shopping portal, so this new microservice is really a disembodied appendage of the game engine; the game engine is now the headless horseman, carrying around his own head. While the game itself is active, its allocated memory represents the truth. This is a situation other software classes don't have to deal with, which is why domain-driven design works there and not here: there, the truth is only ever the database itself.

However, I think we absolutely should lop off its head. We just need to make sure this dispatcher is a cephalophore instead of a Galloping Hessian:

Elaborate more

In essence, by making a SQL dispatcher more tightly coupled with MySQL, performance could be increased by up to 200% by giving the query planner all the help it can get. And a second dispatcher could be made to achieve the same for Postgres. (That is where the headless horseman finds his calling: switching heads.) The database code does use parameterized queries for a couple of things, but conversion to that style is incomplete, and it also does not seem to be generating reusable prepared statements or bound variables. You can read more here, here, here and here. You can read about where this still isn't good enough for some situations here. Currently TFS falls woefully short in this arena. I can tell that most of the work done on this part of TFS was by those who've been highly segregated from DBA concerns.
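To illustrate the placeholder style, here's a toy sketch. Conceptually only: with real prepared statements the SQL text is parsed and planned once by the database server, and bound values are sent out-of-band, typed, never spliced into the statement text at all. This toy binder just shows what the driver-facing API looks like, and it is not an injection-safe escaping routine:

```rust
// Toy illustration of binding values to `?` placeholders. A real
// driver (e.g. a MySQL connector) sends the statement and the values
// separately, so the server can cache one execution plan and reuse it.
fn bind_placeholders(query: &str, params: &[&str]) -> String {
    let mut out = String::new();
    let mut params = params.iter();
    for ch in query.chars() {
        if ch == '?' {
            // NOTE: real drivers never splice values into the SQL text;
            // this is only to visualize what the bound statement means.
            out.push('\'');
            out.push_str(params.next().expect("missing parameter"));
            out.push('\'');
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    // The statement text stays constant across calls, which is exactly
    // what lets the server reuse one cached plan for every player load.
    let stmt = "SELECT * FROM players WHERE name = ?";
    println!("{}", bind_placeholders(stmt, &["Test"]));
}
```

The performance win comes from the constant statement text: the planner's work is done once, not on every call.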
 
The CipSoft setup, at least the 2005 one, did not have SQL management on the server side, but rather a socket connection to request queries from the actual SQL manager server.
Keeping players in memory is also a lot better, since during the day we wouldn't care at all about lag during server saves. However, for this task stability is a must, otherwise it would mean a 24-hour rollback...
 
I mean, the headless horseman has his head detached. But he still has to carry it everywhere. So what has it gained him?


Yes, a single dedicated caller to the database makes sense, and prevents breakage of the "single source of truth" principle, which is what this amounts to. Game active state in the engine's allocated memory becomes different from persisted state in the database altered by the web portal. Preventing this is a 100% good idea.

What I'm saying is that also separating that from the engine makes a new layer analogous to the headless horseman's head: the separation of active state and persisted state will never be clean here, because this is a real-time game, not a shopping portal, so this new microservice is really a disembodied appendage of the game engine; the game engine is now the headless horseman, carrying around his own head. While the game itself is active, its allocated memory represents the truth. This is a situation other software classes don't have to deal with, which is why domain-driven design works there and not here: there, the truth is only ever the database itself.

However, I think we absolutely should lop off its head. We just need to make sure this dispatcher is a cephalophore instead of a Galloping Hessian:



In essence, by making a SQL dispatcher more tightly coupled with MySQL, performance could be increased by up to 200% by giving the query planner all the help it can get. And a second dispatcher could be made to achieve the same for Postgres. (That is where the headless horseman finds his calling: switching heads.) The database code does use parameterized queries for a couple of things, but conversion to that style is incomplete, and it also does not seem to be generating reusable prepared statements or bound variables. You can read more here, here, here and here. You can read about where this still isn't good enough for some situations here. Currently TFS falls woefully short in this arena. I can tell that most of the work done on this part of TFS was by those who've been highly segregated from DBA concerns.

I'm aware of this "issue": the game engine memory holds the truth, while the database is just used for persistence. But the amount of data for which the engine is the truth is limited. It's data that needs to be read/written so fast and so many times that it must be stored in game engine memory. Not every piece of game data needs to be in memory, though. Take TFS's market system: the database is the truth for the market data, and there is no market data in memory because there is no need for it (and damn, TFS does not even use async SQL here, even though it's not that hard and the current async SQL system in TFS could do this perfectly).

My "fix" to this issue is to use the "querymanager" (the microservice) for everything related to SQL/the database. If it's engine-truth memory data, the engine should also have a REST API so the data can be changed. This data can only be changed by the engine (which will update the database with the correct data), so there is no conflict between the website (or other services) and the engine. With this I can solve the classic (for me) player bank balance issue: if the website wants to remove the house bid amount from the player's bank balance, it calls the engine to remove the balance. That way the engine can keep the player's bank balance in memory (which, at least in newer Tibia versions, is used everywhere) while any other service can change it with zero issues (besides the engine being offline, but that's not a big problem I think).
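A toy sketch of that single-writer idea: the engine owns the in-memory balance, and everything else (website, query manager) goes through it. The names and the idea of exposing this over REST are assumptions for illustration, not taken from any existing codebase:

```rust
use std::collections::HashMap;

struct Engine {
    // In-memory truth while the engine is running.
    balances: HashMap<u32, i64>,
}

impl Engine {
    fn new() -> Self {
        Engine { balances: HashMap::new() }
    }

    // The only code path allowed to mutate a balance. A real engine
    // would expose this over its REST API and persist the result via
    // the query manager afterwards.
    fn adjust_balance(&mut self, player_id: u32, delta: i64) -> Result<i64, &'static str> {
        let balance = self.balances.entry(player_id).or_insert(0);
        if *balance + delta < 0 {
            return Err("insufficient funds");
        }
        *balance += delta;
        Ok(*balance)
    }
}

fn main() {
    let mut engine = Engine::new();
    engine.adjust_balance(1, 10_000).unwrap();
    // The website's house-bid withdrawal becomes a call into the
    // engine, so it works even while the character is online.
    match engine.adjust_balance(1, -2_500) {
        Ok(balance) => println!("new balance: {}", balance),
        Err(e) => println!("rejected: {}", e),
    }
}
```

Because there is exactly one writer, the "character must be offline" restriction disappears: the engine serializes every change against its in-memory truth.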
 
@Yamaken well spoken. I have worked a lot on disassembling the Cip binaries, and the fact is that the queue Cip used is the holy grail for TFS.
A queue, not a task list as it is now in TFS; everything must be executed in the loop, not in an async way. Look at how the game loop of the Cip engine works: everything is queued. As I have discussed with Yamaken in private, disassembling and recreating mechanics from the leaked engine may make this new modern server perfect.
 
@Yamaken well spoken. I have worked a lot on disassembling the Cip binaries, and the fact is that the queue Cip used is the holy grail for TFS.
A queue, not a task list as it is now in TFS; everything must be executed in the loop, not in an async way. Look at how the game loop of the Cip engine works: everything is queued. As I have discussed with Yamaken in private, disassembling and recreating mechanics from the leaked engine may make this new modern server perfect.

Asynchronous is always faster; we really shouldn't model everything entirely on a 2005 game server that was mostly coded in C, but CipSoft's structure is definitely ten times better even while being 15-year-old code.

Cip structure looks a bit like this:

  1. A game server loop with asynchronous tasks for different jobs, such as player load orders, player store orders (keeping players in memory), and map cycle refreshes; the results of these asynchronous orders can be obtained on the next game cycle, just like in Cip Tibia, with some timeout interval.
  2. The game server only sends query requests (over a socket) to the query manager, and obtains the result on the next game cycle so as not to stop the game world from looping.
  3. A query manager like CipSoft's: an external program to manage SQL requests based on the game world order ID (support for multiple worlds).
  4. A program for the login server which also connects to the query manager.
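Point 1, picking up results on the next game cycle, can be sketched with nothing but the standard library: a worker thread stands in for the query manager answering a slow order, and the game loop polls a channel each tick instead of blocking on the answer. All names here are illustrative:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Collect every finished order without blocking; called once per tick.
fn drain_finished(rx: &mpsc::Receiver<String>) -> Vec<String> {
    let mut done = Vec::new();
    while let Ok(result) = rx.try_recv() {
        done.push(result);
    }
    done
}

fn main() {
    let (tx, rx) = mpsc::channel();

    // Stand-in for the query manager answering a player-load order.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(30));
        tx.send(String::from("player 1 loaded")).unwrap();
    });

    // Fixed-tick game loop: the world never waits on the database;
    // whatever finished since the last tick is applied now.
    for tick in 0..10 {
        for result in drain_finished(&rx) {
            println!("tick {}: {}", tick, result);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

The timeout interval mentioned above would just be a per-order deadline checked in the same per-tick sweep.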
 
@Yamaken well spoken. I have worked a lot on disassembling the Cip binaries, and the fact is that the queue Cip used is the holy grail for TFS.
A queue, not a task list as it is now in TFS; everything must be executed in the loop, not in an async way. Look at how the game loop of the Cip engine works: everything is queued. As I have discussed with Yamaken in private, disassembling and recreating mechanics from the leaked engine may make this new modern server perfect.
By async, I think Lord means TFS tasks. Yeah, we should exchange that for a game loop, which gives more control, less headache, new opportunities for performance improvements and, very importantly, predictability.
Asynchronous is always faster; we really shouldn't model everything entirely on a 2005 game server that was mostly coded in C, but CipSoft's structure is definitely ten times better even while being 15-year-old code.

Cip structure looks a bit like this:

  1. A game server loop with asynchronous tasks for different jobs, such as player load orders, player store orders (keeping players in memory), and map cycle refreshes; the results of these asynchronous orders can be obtained on the next game cycle, just like in Cip Tibia, with some timeout interval.
  2. The game server only sends query requests (over a socket) to the query manager, and obtains the result on the next game cycle so as not to stop the game world from looping.
  3. A query manager like CipSoft's: an external program to manage SQL requests based on the game world order ID (support for multiple worlds).
  4. A program for the login server which also connects to the query manager.
Async design is the best right now. Async for files, the network and the QueryManager (REST API).
Support for multiple worlds is very nice, and if we use a QueryManager as a layer over the database it comes almost for free. For better organization: a general QueryManager (for all servers, mostly handling account data) and a per-server QueryManager. Each server engine should also have a REST API so external services can change the engine's internal state. I think we should not use a custom binary protocol for the QueryManager; we should use JSON or protobuf (JSON being the best candidate now).
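The queue-instead-of-free-running-tasks idea could look roughly like this dependency-free sketch, where every world mutation is enqueued and then applied in FIFO order inside the tick, keeping the simulation deterministic. The names are purely illustrative:

```rust
use std::collections::VecDeque;

// A queued action mutates the world state when its turn comes.
type Action = Box<dyn FnOnce(&mut WorldState)>;

struct WorldState {
    monsters_killed: u32,
}

fn main() {
    let mut world = WorldState { monsters_killed: 0 };
    let mut queue: VecDeque<Action> = VecDeque::new();

    // Events arriving from the network would be enqueued, never
    // applied directly from whatever thread received them.
    queue.push_back(Box::new(|w: &mut WorldState| w.monsters_killed += 1));
    queue.push_back(Box::new(|w: &mut WorldState| w.monsters_killed += 1));

    // One tick of the game loop: drain the queue in FIFO order, so
    // every mutation happens at a predictable point in the cycle.
    while let Some(action) = queue.pop_front() {
        action(&mut world);
    }
    println!("kills this tick: {}", world.monsters_killed);
}
```

Because only the loop thread ever touches `WorldState`, there are no locks around game state and replaying the same queue always produces the same world.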
 
@Yamaken in a few days I will start an open-source project in Java which will expose a REST API through which users will be able to update the TFS database. If one day TFS is able to make REST API calls, then this might become useful. I'll try to keep all communication async. Still thinking about the authorization method, but JWT is the best option I think.
 