Most Chess Databases use their own format for storing information about chess games and the games themselves. If you want to query or extract any data out of the database package – your best bet is to use pgn files, although recently Aquarium also added ability to export data via scripts. In either case – performance is likely to be an issue as text files can get pretty large. When I learned that Jose Chess uses MySQL, I was curious to find out how to connect to its database so that I can search it with my own queries. Eventually, I was able to get it to work, and run a query like this against its large database which comes with Jose Chess and contains about 1.5 million games
This query shows the opening that scores the best for White – A94 - who would have guessed! The most popular index is B22 – 2. c3 Sicilian. There is obviously now opportunity for running more metrics on players, openings, openings trends, etc.
Getting all the data extracted out of Jose Chess obviously took a bit of reverse engineering, but basically I had to:
- Find the path where Jose Chess and its embedded MySQL instance is storing the database. On Windows 7 it is - C:\Users\<UserName>\AppData\Local\VirtualStore\Program Files (x86)\jose\database\mysql\jose
- Install MySQL and MySQL workbench to have a standalone instance of MySQL
- Create a new database (schema) in MySQL workbench, and in that DB – tables to match the Jose Chess tables
- Drop the Jose files into my wherever my standalone instance of MySQL is storing its files - C:\ProgramData\MySQL\MySQL Server 5.5\data\jose_huge_db
- Restart standalone instance of MySQL
- Run the query on the screenshot above!
Here are the files that need to be dropped around:
The only (but significant) disappointment is that Jose Chess is still storing game text (moves) in its internal binary format, so I can’t easily get to that data …