02-16-2023, 12:03 PM   #1
Trumpet63
Mostly Ignored
Skill Rating Designer
Retired Staff, FFR Veteran
Join Date: Mar 2011
Posts: 471
I made an auto-difficulty model

Hi. You may remember me from past hits such as:

Skill rating: (Feb 2015)
https://www.flashflashrevolution.com...d.php?t=140927

An AI model applied to FFRMania difficulties: (Jan 2018)
https://www.flashflashrevolution.com....php?p=4606494


Well, I’ve learned a lot since then. In 2020 I landed my first job as a data scientist. In 2021 I was given access to the FFR database and made a very basic statistical model that achieved a mean squared error of 35.9 on a set of 2684 FFR songs. It was nothing to write home about, but very few people had access to the FFR database, and fewer still had the skill or desire to create an auto-difficulty model. After I announced this model internally, interest in the problem grew, which led to the creation of the #difficulty-system channel in the FFR Discord server.

I had, however, solved the problem to my satisfaction at the time. I had a serviceable difficulty model, which is what I needed to continue development of my rhythm game, so that’s what I focused on, until around December of 2022, a few months ago. I had quit my data science job, and while looking for greener pastures I had a lot of time to think about rhythm games. I want to be the best at understanding player skill, and that is supposed to be the selling point underlying my rhythm game project. After solving a lot of more pressing issues in my game, I eventually came to this problem again. My understanding of skill had basically stayed the same since my last model, and I knew I could do so much more with my data science experience, so I got to work.

Around the same time I started working on the problem again, WirryWoo, who’s also a data scientist, consulted with me about an auto-difficulty model he was working on. It seemed interesting, if slightly over my head, so I offered as much advice as I could and generally just kept to myself. Then suddenly I made a breakthrough: 30.7 mean squared error using a machine learning technique I had just learned. This was insane. Not an incremental change, but a 14% drop in error. At this point I was almost sure the problem was solved. I had used the technique, the one that everyone used, XGBoost, the most cutting-edge AI model. Then WirryWoo told me his model had an MSE of 24.4.

Smash cut to today. 60 model prototypes later, I have achieved an MSE of 11.0. I've been delaying this announcement for a while since so many improvements have been happening so quickly, and even now there’s every chance that this model will be blown away in a few weeks, but at least I’m pretty close to being out of new ideas.
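
For anyone who wants the numbers made concrete, here's a minimal Python sketch of what MSE measures and how the milestones above compare. The function names and the toy difficulty values are made up for illustration; only the MSE figures (35.9, 30.7, 24.4, 11.0) come from this post, and this is not the evaluation code behind any of the models.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared gaps between assigned and predicted difficulty."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(old_mse, new_mse):
    """Fraction by which the error dropped going from old_mse to new_mse."""
    return (old_mse - new_mse) / old_mse


# Toy example with invented difficulties (not real FFR data).
actual = [10, 25, 40, 55]
predicted = [12, 22, 41, 60]
toy_mse = mean_squared_error(actual, predicted)
print(f"toy MSE: {toy_mse}")

# MSE figures quoted in this post.
milestones = [
    ("2021 statistical model", 35.9),
    ("first XGBoost model", 30.7),
    ("WirryWoo's model", 24.4),
    ("current model", 11.0),
]

baseline = milestones[0][1]
for name, mse in milestones:
    drop = relative_reduction(baseline, mse)
    rmse = math.sqrt(mse)
    print(f"{name}: MSE {mse}, RMSE {rmse:.1f}, {drop:.0%} below the 2021 model")
```

Taking the square root puts the error back in the same units as the difficulty values: an MSE of 11.0 works out to an RMSE of about 3.3, versus about 6.0 for the 2021 model.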

Also, apologies to anyone else working on auto-difficulty that I’ve left out of this story. I wanted to keep it short and my memory of this isn’t that good.

So anyways

Here’s a spreadsheet of the model’s predictions for all the FFR songs, sorted so the worst predictions are on top:
https://docs.google.com/spreadsheets...it?usp=sharing
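
If you want to do the same kind of worst-first sort on your own predictions, here's a minimal Python sketch. It assumes "worst" means the largest absolute gap between the assigned and predicted difficulty; the spreadsheet may use a different measure. The `Prediction` type, song names, and numbers are all hypothetical.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    song: str
    actual: float      # difficulty assigned by humans
    predicted: float   # difficulty produced by the model


def worst_first(rows):
    """Return rows ordered by absolute error, largest error on top."""
    return sorted(rows, key=lambda r: abs(r.actual - r.predicted), reverse=True)


# Invented rows for illustration only.
rows = [
    Prediction("Song A", actual=30, predicted=31),
    Prediction("Song B", actual=85, predicted=74),
    Prediction("Song C", actual=52, predicted=57),
]

ranked = worst_first(rows)
for r in ranked:
    print(f"{r.song}: actual {r.actual}, predicted {r.predicted}, "
          f"off by {abs(r.actual - r.predicted)}")
```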

Here are some explanatory images:


In closing,
I'll say that I don't really expect FFR to switch over to auto-difficulty overnight. FFR has a long history of assigning difficulties manually, and as such the standards are very high, perhaps too high for any model to achieve. I personally would like to see an auto-difficulty model gradually integrated into the stepfile acceptance workflow. I think there is an appetite for it, especially among veteran difficulty consultants who are feeling a little burnt out after spending years debating with people about the now 3000+ songs on FFR. Still, switching to auto-difficulty seems like a tradeoff at best. It's not yet clear what we'd be trading off for, and it's not really for me to say. It will take someone with authority and imagination to envision what FFR could become with improvements to automation. The best I can do is facilitate that process. I have uploaded the model to an API, and I will make it available to any staff members, for FFR internal use only.
__________________
2014 October 7th 1:03 AM

Zageron: Trumpet
Trumpet63: yes, im here
Zageron: You have a problem.