serialize_lib does not open any files in binary mode. This leads to
at least incompatibility of save files between windows and linux and
at worst a completely broken nodedb load on windows.
`mintrain` was copied to the linux guest as `mintrain_woe32` and loaded.
The initial load failed but advtrains successfully re-wrote the nodedb.
* See appendix 1 for a full log
The station/stop rail information, the signals and TCBs all had info lost.
On first opening the ops panel the environment and code embedded in the panel
appeared to be preserved, but actually were lost after punching the ops panel;
they were presumably only visible by the formspec information but were actually
lost.
On linux, the mintrain world loaded all nodes into netmapper sucessfully and
rendered them correctly. On windows, only a few nodes are visible. Running the
debugger seems to show that the deserializer skips almost to the end of the file
when deserializing the content ids, leaving only a few nodes actualy loaded.
At one stage, the `mintrain` world only had 4 nodes serialized under windows;
later the full set was written. However, the fact that only a small amount of
nodes ever get de-serialized persisted as an issue, even in a tiny LuaJIT
interactive session:
> f, err = io.open("F:\\Non-installs\\minetest-5.2.0-win64\\worlds\\mintrain\\advtrains_ndb4.ls")
> print(err)
nil
> print(f:seek())
0
> a = f:read(1)
> print(string.byte(a))
> print(f:seek())
605
This issue manifested itself on the latest LuaJIT git version, and on
the earlier
2.0.5 source package. It manfested itself when compiling with the original MinGW
toolchain. It manifested itself when cross-compiling with the mingw64 toolchain
on linux. It manifested itself when compiling with MSVC using LuaJIT's
msvcbuild.bat. It did not manifest on luajit running on Linux. It manifested for
MinGW under Windows 10 as well (my usual test environment is Windows 7). The
issue also manifested with the mainstream lua project's version of lua.
I ran these various compiles of LuaJIT through gdb while running netmapper
on the world in question. I found the issue was occuring at
luajit/src/lib_io.c:178 in a call to fread, where the contents of the file
pointer suddenly pointed ahead to byte 605. At this point I knew I couldn't
step inside the implementation of fread as it's probably part of a Windows dll.
Suspecting a bug in fread, I then wrote a small C program and tested it
against the nodedb:
#include <stdio.h>
int main(int argc, char *argv[]) {
char buf = '\0';
size_t itemsRead;
FILE *f = fopen("advtrains_ndb4.ls", "r");
printf("File pos on open: %d\n", ftell(f));
itemsRead = fread(&buf, 1, 1, f);
printf("File pos after fread(1,1): %d\n", ftell(f));
printf("Result on close: %d\n", fclose(f));
return 0;
}
Output:
File pos on open: 0
File pos after fread(1,1): 605
Result on close: 0
I then replaced the second argument to fopen with "rb":
File pos on open: 0
File pos after fread(1,1): 1
Result on close: 0
Showing that windows indeed has a severe bug in fread in text mode.
Note that in the nodedb a CR-LF occurs way, way before offset 605 into the file.
fread would be incorrect even if it were operating line-wise.
The only recourse I can see now is that we must move all file operations to
binary mode, and attempt to fix all advtrains *.ls files with an upgrade process
that is similar to running dos2unix on all of the files.
Always opening files in binary mode is a lesson I have learned from earlier
projects but not with such disastrous consequences. It's understandable orwell
may not have been aware of such issues since he seems to work solely on Linux.
Appendix 1: debug.txt of advtrains loading data previously saved on windows,
then run on linux
021-03-04 14:27:04: ACTION[Main]: [advtrains] Loading...
2021-03-04 14:27:04: WARNING[Main]: register_platform couldn't find
preset node default:stonebrick
2021-03-04 14:27:04: WARNING[Main]: register_platform couldn't find
preset node default:sandstonebrick
2021-03-04 14:27:04: WARNING[Main]: Undeclared global variable
"mesecon" accessed at
...e/debian/.minetest/mods/advtrains/advtrains/init.lua:223
2021-03-04 14:27:04: ACTION[Main]: [advtrains] Loaded in 13.415999999999ms
2021-03-04 14:27:04: WARNING[Main]: Undeclared global variable
"mesecon" accessed at
...netest/mods/advtrains/advtrains_train_track/init.lua:326
2021-03-04 14:27:04: WARNING[Main]: Undeclared global variable
"intllib" accessed at
...test/mods/advtrains/advtrains_luaautomation/init.lua:6
[minimal][modchannels] Code loaded!
2021-03-04 14:27:04: ACTION[Main]: Yeah experimental loaded mods.
2021-03-04 14:27:04: ACTION[Server]: [advtrains] loading
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_version
2021-03-04 14:27:04: ACTION[Server]: [advtrains] loading savefiles
version 4 (serialize_lib)
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?): Loading
data from file '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_core.ls'
failed, trying .new fallback:
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?):
....minetest/mods/advtrains/serialize_lib/serialize.lua:20' instead!d
header, got 'LUA_SER v=1
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Unable to
load data from '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_core.ls.new':
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?):
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_core.ls.new: No
such file or directory
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Note: This
message is normal when the mod is loaded the first time on this world.
2021-03-04 14:27:04: ACTION[Server]: [advtrains][nodedb] read 21 node
content ids.
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?): Loading
data from file '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_ndb4.ls'
failed, trying .new fallback:
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?):
...debian/.minetest/mods/advtrains/advtrains/nodedb.lua:141: NDB file
is corrupted (found entry with invalid cid)
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Unable to
load data from '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_ndb4.ls.new':
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?):
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_ndb4.ls.new: No
such file or directory
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Note: This
message is normal when the mod is loaded the first time on this world.
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?): Loading
data from file '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_interlocking.ls'
failed, trying .new fallback:
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?):
....minetest/mods/advtrains/serialize_lib/serialize.lua:20' instead!d
header, got 'LUA_SER v=1
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Unable to
load data from '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_interlocking.ls.new':
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?):
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_interlocking.ls.new:
No such file or directory
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Note: This
message is normal when the mod is loaded the first time on this world.
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?): Loading
data from file '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_lines.ls'
failed, trying .new fallback:
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?):
....minetest/mods/advtrains/serialize_lib/serialize.lua:20' instead!d
header, got 'LUA_SER v=1
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Unable to
load data from '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_lines.ls.new':
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?):
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_lines.ls.new:
No such file or directory
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Note: This
message is normal when the mod is loaded the first time on this world.
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?): Loading
data from file '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_atlatc.ls'
failed, trying .new fallback:
2021-03-04 14:27:04: WARNING[Server]: [serialize_lib] (?):
....minetest/mods/advtrains/serialize_lib/serialize.lua:20' instead!d
header, got 'LUA_SER v=1
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Unable to
load data from '/home/debian/.minetest/worlds/mintrain_woe32/advtrains_atlatc.ls.new':
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?):
/home/debian/.minetest/worlds/mintrain_woe32/advtrains_atlatc.ls.new:
No such file or directory
2021-03-04 14:27:04: ERROR[Server]: [serialize_lib] (?): Note: This
message is normal when the mod is loaded the first time on this world.
2021-03-04 14:27:04: ACTION[Server]: [advtrains][load_all]Loaded
advtrains save files
2021-03-04 14:27:04: ACTION[Server]: [advtrains][lines][rwt]
Initialized railway time: 00;00
.__ __ __
_____ |__| ____ _____/ |_ ____ _______/ |_
/ \| |/ \_/ __ \ __\/ __ \ / ___/\ __\
| Y Y \ | | \ ___/| | \ ___/ \___ \ | |
|__|_| /__|___| /\___ >__| \___ >____ > |__|
\/ \/ \/ \/ \/
2021-03-04 14:27:04: ACTION[Main]: World at
[/home/debian/.minetest/worlds/mintrain_woe32]
2021-03-04 14:27:04: ACTION[Main]: Server for gameid="minimal"
listening on 0.0.0.0:50615.
Unified Inventory. inventory size: 140
2021-03-04 14:27:05: WARNING[Main]: Irrlicht: Could not open file of
texture: rail.png
2021-03-04 14:27:05: WARNING[Main]: Irrlicht: Could not open file of
texture: rail.png
2021-03-04 14:27:05: WARNING[Main]: Irrlicht: Could not open file of
texture: rail.png
2021-03-04 14:27:05: WARNING[Main]: Irrlicht: Could not open file of
texture: rail.png
2021-03-04 14:27:05: ACTION[Server]: singleplayer [127.0.0.1] joins
game. List of players: singleplayer
2021-03-04 14:27:05: ACTION[Main]: [cs_waypoints] CSM loading...
2021-03-04 14:27:09: ACTION[Main]: Server: Shutting down
2021-03-04 14:27:09: ACTION[Main]: [advtrains]Saved advtrains save
files, took 1 ms
Hi,
thanks for pointing this issue out. I changed all file openers to binary mode. file:read("*line") is still able to handle line-wise reading, so everything works.
A possible error cause for the weird behavior you mentioned is that apparently Windows MT client adds CR characters to multiline text fields:
http://www.bleipb.de/filedrop/carriage_return.png
Probably, you produced a CRLF in the text file somewhere this way and this confused file:read("*line").
As a countermeasure, now the CR character is also escaped.
I've tested the fix on my Win7 instance using the LinuxForks saves and a local world. Do you mind testing it on your machine again to see if it's really fixed?
orwell
Hi orwell,
Thanks for the quick turnaround.
Firstly, it's important to mention that the lua_ser files that have
been saved under windows until this point have ALL lines written as
CRLF - it's what happens when the file is not opened in "b" for
"binary" mode. Yes, this includes the nodedb. Files opened such are in
"text" mode, and thus "LF" meaning newline is automatically translated
to "CRLF" (hex: 0D 0A) under Windows; it's like a "do what I mean, not
what I say" kind of feature; maybe Mac OS 9 even wrote CRs instead of
LFs [citation sorely needed]. Here's the header of advtrains_core.ls
in a hex dump of my old world:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 4C 55 41 5F 53 45 52 20 76 3D 31 0D 0A 53 61 74 LUA_SER v=1..Sat
00000010 63 3A 54 0D 0A 53 63 6F 6E 74 72 6F 6C 6C 65 72 c:T..Scontroller
00000020 73 3A 54 0D 0A s:T..
The new version of serialize_lib works well as verified by playing
minetest with my usual interlocking 3-stop setup and also can work
with my development branch of netmapper. Carriage returns pasted into
a lua automation ops panel are also saved, verified by looking with a
hex editor at the save files.
Unfortunately, this new version doesn't provide a proper migration
path from serialize_lib version 1 to serialize_lib version 2 files for
windows hosts. I had an interlocking setup on 3dc6426e that, when
upgraded to 5ee8c0e39, loses all of the save data again; after
re-doing the interlocking setup the changes stick properly though.
Here's a hex dump of the same header on the same world after I re-did
the interlocking:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 4C 55 41 5F 53 45 52 20 76 3D 32 0A 53 61 74 63 LUA_SER v=2.Satc
00000010 3A 54 0A 53 63 6F 6E 74 72 6F 6C 6C 65 72 :T.Scontroller
It's not good enough to say to windows users 'do your stuff again',
because people might have quite a bit of interlocking saved and we all
know it's a long process to set it up again.
So I'm working on writing a proper migration process that will take
lua_ser files from version 1 to version 2 by replacing all CRLF to
just LF, like dos2unix. It has the minor side effect that any carriage
returns inside atlatc ops panels or environment code will get wiped,
but that's a smaller price to pay and much harder to avoid anyway. I
hope anybody editing such code in an external editor isn't using
Notepad.
Do issues still appear since e5060860?
(except with the nodedb, see https://forum.minetest.net/viewtopic.php?p=394878#p394878)
If not, I think we can close this. Future breakages won't be severe as strict loading is now the default.
No further problem reports. Closing.