08-25-2019, 01:04 PM   #13
TC_Halogen
Rhythm game specialist.
Retired Staff, FFR Simfile Author, FFR Music Producer, D8 Godly Keysmasher, FFR Veteran
 
Join Date: Feb 2008
Location: Bel Air, Maryland
Age: 32
Posts: 19,376
Re: Scoring Data in Brutal Difficulty Range

The solution to this problem comes down to whether or not we eventually shift away from FFR's mindset of rating based off of AAA difficulty and take more physical capability into account. Other rhythm games don't arbitrarily skyrocket the ratings of entirely playable charts simply because they contain concepts that make them tougher to perfect. Songs like Crowdpleaser continue to be treated as an anomaly in any case, though. FFR obviously cannot rely on passability for the purpose of difficulty scaling, considering that the game mechanics can promote endless mashing to stay alive, though that's not really impactful for the leaderboard system.

On top of it, we're contending with the game's process of rendering charts where songs that are LOWER in effective tempo can be harder than songs that are higher in tempo due to overall frame distribution (see: Almost There versus Magical 8-bit Tour).

Wanderflux and Gamma both feel substantially harder than RATO does to me, and yet they fall lower down on the difficulty scale simply because they don't theoretically have parts that are harder to AAA compared to RATO. My average score on RATO will be better than Wanderflux/Gamma nearly 100% of the time because the physical requirement to play the chart is substantially lower on account of all of the difficulty being contained within jumptrillable walls. Yet, RATO sits at 108 because of this. Naturally, not enough time has passed to make a claim about overall scoreboard spread between these two songs, but I'd be willing to wager that a good majority of players will have a similar relationship between these two songs.

On the flip side: FFR values difficulty spikes way too much due to the skill rating algorithm placing an exceptional weight on scores that are near or at the AAA-level. Two quick examples:

- Husigi should not be a 101, it should be a hard-to-AAA-(number lower than 100) because of the large wall and short [14] jack sequence. Aside from those two bits that very literally account for less than 5 seconds of the total length, the chart does not have much that puts it beyond the low-mid 90s. Full-run single digit good counts (not necessarily effective SDGs) can be found all the way down to rank 70 for a song whose difficulty level actually eclipses the entry requirement for FFR's D8 in this current official.

- Winter Wind Etude doesn't have quite as deep of a score spread for a 101, but the chart has virtually no physical boundaries compared to anything that's 90+; the chart is rated as high as it is because of the brief jack/polys and the ending (which isn't all that difficult even for 100 standards). Numerous D7-level competing OT players (even some that might not have been thought of as front-runners) have extremely good scores on this, which might be a telltale sign that a rating drop is required.

I won't go through the entire list of files with explanations behind everything, but here are my suggestions (keeping in mind that they come from a bit of a different angle):

- System Doctor: 101 -> 98*
- Husigi Usagi Milk Tei: 101 -> 99*
- White Walls Part 2: 102 -> 100*
- Winter Wind Etude: 101 -> 100
- Violent Arcade: 103 -> 102*
- RATO: 108 -> 106*

[...etc]

The disparity between Revo and every other 102 is absolutely obnoxious, by the way...

* reiterating that this is a change reflective of physical capability, not necessarily difficulty to AAA. Changes like this absolutely would not work with our current skill rating algorithm and would likely cause some annoyances.


With respect to our current system and putting aside the concept of physicality, though - my thoughts:
- Snafu: 100 -> 99
- Powerflux: 103 -> 104
- Vegas/OWA-Skeletor: 100 -> 101
- Magical 8bit Tour: 99 -> 100 (frameslol)

...with respect to not using that to guide the discussion:
Quote:
The skill rating system approximates the highest level you're capable of AAA'ing, and when a player's personal rating is higher than any AAA they've ever achieved, something's up. Current scaling awards a higher equivalency for 2-0-0-1 on a file rated x+1 in difficulty compared to a AAA on a file rated x, and I feel many players would agree the AAA in this instance is likely more impressive, assuming the charts are both rated appropriately.
It has become increasingly clear that there needs to be some sort of logarithmic scaling of AAA equivalency where the higher the difficulty gets, the more lenience a score should get to be considered "as impressive" as you mentioned. This in turn points back to the difficulty scale, though: how do we use a mathematical equivalence for comparing the results of a song at x difficulty to a AAA at x-y difficulty when certain disparities clearly exist within a point of difficulty? A length factor might be needed on top of the score for determining AAA equivalence (as in: slow the exponential decay rate as the length of the song grows).
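
To make that idea a bit more concrete, here's a very rough sketch of what a length-aware equivalency curve could look like. This is not FFR's actual skill rating formula: the function name, the use of a plain good count as the penalty input, the square-root length factor, and every constant are assumptions picked purely for illustration.

```python
import math


def aaa_equivalency(difficulty, goods, note_count,
                    base_rate=0.05, reference_notes=1000):
    """Hypothetical AAA equivalency for a non-AAA score.

    All parameters and constants are illustrative assumptions:
    - difficulty: the chart's rated difficulty (must be > 0)
    - goods: number of imperfect judgments (0 means a AAA)
    - note_count: chart length in notes (must be > 0)
    """
    if difficulty <= 0 or note_count <= 0 or goods < 0:
        raise ValueError("difficulty and note_count must be positive, goods >= 0")

    # Logarithmic leniency: the higher the difficulty, the more forgiving
    # each individual good becomes.
    leniency = math.log1p(difficulty)

    # Length factor: longer charts slow the decay rate down.
    length_factor = math.sqrt(note_count / reference_notes)

    # Exponential decay away from the chart's rated difficulty.
    rate = base_rate / (leniency * length_factor)
    return difficulty * math.exp(-rate * goods)


# A AAA is always worth the chart's full difficulty.
print(aaa_equivalency(100, 0, 1000))
# Same good count, but the longer chart loses less equivalency.
print(aaa_equivalency(100, 5, 1000), aaa_equivalency(100, 5, 3000))
```

The only properties that matter in the sketch are the ones described above: a AAA maps to the rated difficulty, each good costs proportionally less as difficulty rises, and the decay slows as the chart gets longer. The actual curve shape and constants would have to be fit against real scoreboard data.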

Take where is my balls (100) and A Dichroic Glass Snafu (100) -- both released within a week of each other. The difference in score spread is pretty quickly apparent. Nearly the same number of players, and very close number of times played (within a few hundred), yet the score spread clearly favors the shorter song. That might need to be another point to look into for developing a new skill-rating algorithm.
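
If someone wants to put a number on "score spread" for comparisons like this, one simple way (an assumption on my part, not something the site computes) is to look at the median and interquartile range of good counts per chart. The function name and the choice of good counts as the input are hypothetical; no real scoreboard numbers are used here.

```python
import statistics


def score_spread(good_counts):
    """Summarize one chart's scoreboard as its median and interquartile range.

    good_counts: one good count per player's best score (hypothetical input).
    """
    if len(good_counts) < 2:
        raise ValueError("need at least two scores to measure spread")
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(good_counts, n=4)
    return {"median": median, "iqr": q3 - q1}


def tighter_spread(counts_a, counts_b):
    """Return 'a' or 'b' for whichever chart has the smaller IQR, or 'tie'."""
    iqr_a = score_spread(counts_a)["iqr"]
    iqr_b = score_spread(counts_b)["iqr"]
    if iqr_a == iqr_b:
        return "tie"
    return "a" if iqr_a < iqr_b else "b"
```

Feeding both charts' top-N good counts through `tighter_spread` would give a quick sanity check on whether two charts at the same rating really behave the same on the scoreboard.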

Last edited by TC_Halogen; 08-25-2019 at 04:41 PM..