How to #sisileaks

Or how I learned to stop playing and love the downtime

New patches on test servers often introduce new items and whatnot, and often those changes are not easily found. Curiosity and a good portion of magic crystal balls led me to take a closer look at what’s under the hood of eve for quite a while now. New items (typeIDs) are just a fraction of what is waiting to be discovered, but that’s where most of the excitement originates (next to art assets). The way I discover new material evolved over time. Sometimes others ask me where (and how) they can discover new stuff. If you also wonder, I’ll describe some of the steps I currently follow (though my own procedures differ a bit).

Patch?

eve’s CREST API at https://api-chaos.testeveonline.com/ provides the current version of the corresponding server:

[Screenshot: the CREST root document with its serverVersion]

The other servers use:
https://api-sisi.testeveonline.com/
https://api-duality.testeveonline.com/
and of course https://crest-tq.eveonline.com/

I will continue with chaos. Parse the string at root -> serverVersion. This is 1051180 in this example. With this number as part of the file name, load

https://binaries.eveonline.com/eveonline_1051180.txt
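
How you fetch that value is up to you; here is a minimal PHP sketch that reads the serverVersion from the CREST root and builds the binaries index URL, following the endpoint and file naming shown above:

<?php
// Read the CREST root document and pull out the serverVersion.
$root = json_decode(file_get_contents('https://api-chaos.testeveonline.com/'), true);
$serverVersion = $root['serverVersion'];   // e.g. 1051180

// The binaries index for that build lives at binaries.eveonline.com.
$indexUrl = 'https://binaries.eveonline.com/eveonline_' . $serverVersion . '.txt';
file_put_contents('eveonline_' . $serverVersion . '.txt', file_get_contents($indexUrl));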

Within that CSV file you will find details on files that are not part of the DoD resource cache but are used in the client installation itself (notice the app:// prefix, which stands for the root of the client’s folder). It is the binary index, sort of. The entry resfileindex.txt (nearly at the end of the file) is what we need.

[Screenshot: the resfileindex.txt entry in the binaries index]

The launcher does this automatically and saves all files from the index to C:\ProgramData\CCP\EVE\SharedCache\[Server]\

Right after the comma is the (path and hashed) file name of the resfileindex.txt that is valid for the current version of the client, another CSV file.
Server version 1051180 uses 1d/1d34143a37d4b739_8a72298522ae73e3e2fd81324162e1ee as resfileindex.txt (note 1). Download that one from the same binaries.eveonline.com server:

https://binaries.eveonline.com/1d/1d34143a37d4b739_8a72298522ae73e3e2fd81324162e1ee 
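
A minimal sketch of that lookup, assuming the binaries index from above was saved locally as eveonline_1051180.txt:

<?php
// Find the resfileindex.txt entry in the binaries index; the hashed file name
// right after the comma is what we download from binaries.eveonline.com.
foreach (file('eveonline_1051180.txt') as $line) {
    $cols = str_getcsv($line);
    if (stripos($cols[0], 'resfileindex.txt') !== false) {
        echo 'https://binaries.eveonline.com/' . $cols[1] . "\n";
        break;
    }
}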

You will probably want to rename it to something like resfileindex_1051180.csv. In it you will find details of all resource files used by the client. These are stored in the SharedCache folder and not within the client directory, and files that are identical across servers are shared by all of them. This time I only care about item information, to find out whether there are new items on the (test) server.

Staticdata

Item categories and groups, as well as items themselves are found within the staticdata portion of the shared cache:

[Screenshot: the staticdata entries in the resfileindex]
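
Picking those entries out of the resfileindex can be scripted as well. A minimal sketch, assuming (as the screenshot suggests) that the first CSV column is the logical res:/ path and the second one the hashed file name found in the shared cache:

<?php
// List the staticdata entries of the resfileindex together with their hashed names.
foreach (file('resfileindex_1051180.csv') as $line) {
    $cols = str_getcsv($line);
    if (strpos($cols[0], 'staticdata') !== false) {
        echo $cols[0] . ' => ' . $cols[1] . "\n";
    }
}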

You can just change their extension from static to sqlite, because that’s what they are – SQLite3 databases. The data, though, sits in a single column, and the layout is the same for each of the three databases:

[Screenshot: evetypes.static opened as an SQLite database]

The key column contains only the ID, which is
categoryID for evecategories.static
groupID for evegroups.static
typeID for evetypes.static

If you only care about IDs that were not present in previous versions, the key column is certainly useful. The time column contains the Unix timestamp of (the last change done to?) that row.
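
Spotting additions then boils down to a set difference between the key columns of an old and a new copy of the file. A minimal sketch; the file names are placeholders, and the table name is looked up from sqlite_master rather than assumed:

<?php
// Return all values of the key column of a .static file.
// They are SQLite3 databases, so the extension does not matter to SQLite3.
function loadKeys(string $file): array {
    $db = new SQLite3($file, SQLITE3_OPEN_READONLY);
    $table = $db->querySingle("SELECT name FROM sqlite_master WHERE type = 'table' LIMIT 1");
    $keys = [];
    $result = $db->query("SELECT key FROM \"$table\"");
    while ($row = $result->fetchArray(SQLITE3_NUM)) {
        $keys[] = $row[0];
    }
    return $keys;
}

// typeIDs that exist in the new version but not in the old one
$newTypeIDs = array_diff(loadKeys('evetypes_sisi.static'), loadKeys('evetypes_tq.static'));
print_r($newTypeIDs);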

We need the content of the value column:

[Screenshot: the value column of evetypes.static]

The details of each ID (category, group, type) are saved to that field as JSON key/value pairs. The order and presence of keys is not the same for every ID, so make sure you cover them all.

[Screenshot: the JSON value for Spodumain]
This is what the array for Spodumain looks like.
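
Reading those values from a script is a matter of one query and one json_decode. A minimal sketch; 12345 stands in for whatever typeID you are interested in, and again the table name is taken from sqlite_master:

<?php
// Pull the JSON details for a single typeID out of evetypes.static.
$db = new SQLite3('evetypes.static', SQLITE3_OPEN_READONLY);
$table = $db->querySingle("SELECT name FROM sqlite_master WHERE type = 'table' LIMIT 1");

$stmt = $db->prepare("SELECT value FROM \"$table\" WHERE key = :id");
$stmt->bindValue(':id', 12345, SQLITE3_INTEGER);
$row = $stmt->execute()->fetchArray(SQLITE3_NUM);

$type = json_decode($row[0], true);
// typeNameID and descriptionID point into the localization data described below.
print_r($type);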

No strings attached! typeName and description are missing; instead you’ll find typeNameID and descriptionID. Categories and groups also only contain nameIDs (and descriptionIDs), which is why we have to look for the actual strings in some other file(s).

Strings and text

The approach I describe below can be done more easily, more effectively and overall better with a Python script that works with the .pickle files. If you can and want to choose that option, skip the PHP/N++ way.
With Python you can also handle a lot more than just the strings. In that case, you should definitely take a look at Entity’s rEVErence!

Names (for categories, groups and of course items) as well as item descriptions (and more than just that) reside within the resource “folder” localizationfsd:

[Screenshot: the localizationfsd entries in the resfileindex]

localization_fsd_en_us.pickle gets us the above text and also text used in various UI elements. localization_fsd_main.pickle contains the link between the two.

All of these localization_fsd_*.pickle files are ANSI encoded and contain plain ASCII with non-ASCII characters escaped. This is also the case for line feeds (LF, ASCII 10, 0x0a).

[Screenshot: an entry in localization_fsd_en_us.pickle]
Within localization_fsd_en_us.pickle the typeNameID 67725 is tied to Spodumain.

Just convert that file to a table which holds textID <=> text, removing the noise that surrounds it (note 2). To achieve that, follow the steps below in a PHP script or with Notepad++ / any editor that can find and replace with regular expressions:

Each step below lists the Notepad++ search and replacement as a comment, with the equivalent PHP line underneath; $source holds the raw file content, $outfile is where the result goes.

// N++ extended replace
$input = utf8_encode($source);

// \r\u000a  to  <br>
$input = str_replace(chr(13)."\u000a", "<br>", $input);

// \u000a  to  <br>
$input = str_replace("\u000a", "<br>", $input);

// "  to  '
$input = str_replace("\"", "'", $input);

// \t  to  ' '
$input = str_replace(chr(9), "' '", $input);

// \nsI  to  #,#
$input = str_replace(chr(10)."sI", "#,#", $input);

// \n(V  to  |,|
$input = str_replace(chr(10)."(V", "|,|", $input);

// In N++: see below.
$input = str_replace("(dp2".chr(10)."I1".chr(10), "\nNNtp1#,#1\n", $input);

// N++ regex replace
// (?!^.*#,#.*$)^.+  to  (nothing)
$input = preg_replace("/^((?!#,#).)*$/m", "", $input);

// ^.*?tp.*?#,#.*?  to  (nothing)
$input = preg_replace("/^.*?tp.*?#,#.*?/m", "", $input);

// N++ regex replace (\n matches newline)
// [\n]{2,}  to  \n
$input = preg_replace("/[\n]{2,}/", "\n", $input);

// N++ extended replace again
// |,|  to  ,"
$input = str_replace("|,|", ",\"", $input);

// \n  to  "\n
$input = str_replace("\n", "\"\n", $input);

// In N++: look below for the additional steps required if you choose not to use a script.
$input = preg_replace_callback(
    '/\\\\u([0-9a-fA-F]{4})/',
    function ($match) {
        return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
    },
    $input);
$input = str_replace("\\u0394", "Δ", $input);

file_put_contents($outfile, $input);

The resulting pile of text will have stripped text ID 1 at the beginning of the file. If you wish to keep that entry intact, add 1,"Passive" (for en-us).

N++ extra steps

[Screenshot: the converted textID/text list]
The result already looks better

Characters beyond the ASCII range are still encoded and require another step (even more so in the ru, jp and zh localization files!). To handle that, copy all of the text and use a converter or your own script of choice.

[Screenshot: escaped characters before and after conversion]
A portion of localization_fsd_ru shows the result better (descriptionID 287972).

One character is missed: \u0394 should turn into Δ, if you absolutely want to convert everything, even formulas.
Paste the result into a (new) text file which is UTF-8 encoded!

[Screenshot: the result pasted into a UTF-8 encoded text file]

Once you are finished with creating tables out of the localization data, it’s up to you how you proceed to get categories, groups and items paired with their correct names and descriptions. As long as you refrain from using a spreadsheet like in the example below:

[Screenshot: a spreadsheet matching IDs to names]
Though that worked for me when I started peeking …
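
If you’d rather stay in code, here is a minimal sketch of that pairing step. The file names are placeholders for whatever the earlier steps produced: the converted localization saved as localization_en_us.csv with textID,"text" rows, plus evetypes.static from the shared cache.

<?php
// Build a textID => text lookup table from the converted localization CSV.
$names = [];
foreach (array_map('str_getcsv', file('localization_en_us.csv')) as $row) {
    if (isset($row[1])) {
        $names[(int)$row[0]] = $row[1];
    }
}

// Walk over all types and attach the localized name via typeNameID.
$db = new SQLite3('evetypes.static', SQLITE3_OPEN_READONLY);
$table = $db->querySingle("SELECT name FROM sqlite_master WHERE type = 'table' LIMIT 1");
$result = $db->query("SELECT key, value FROM \"$table\"");
while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
    $type = json_decode($row['value'], true);
    $name = $names[$type['typeNameID'] ?? 0] ?? '(no name found)';
    echo $row['key'] . ';' . $name . "\n";     // typeID;typeName
}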

Done! The JavaScript converter fails to convert \u0394 to Δ, though. Anyway, that was easy!

And what about the other changed DoD resources? More on that maybe later.


Note 1: The hash before the underscore “_” is the same for all different versions of that binary/resource data file, while the part after the underscore is unique for that distinct version.

Note 2: I’m absolutely sure that there are many better ways to handle that, but for this specific need it suits me well.

General note: There are many steps here that scream for automation, and I already automate some of them, like the conversion from SQLite to JSON. The above procedure shows what needs to be done, without worrying about efficiency.

2 thoughts on “How to #sisileaks”

    1. Meanwhile, that’s what I did, and the use of Python (or rather rEVErence) allowed me to not just handle the localization better but also a lot of other data from chaos, sisi, tq (and occasionally duality).

      I hadn’t tried the Python approach before, as my experience with coding .anyLanguage is still lacking and my assumption was that I wouldn’t be able to create any working results. Until recently.

      Your comment (that took me quite a while to approve, sorry) and others on #devfleet slack finally made me give it a try.
