03-28-2017, 04:14 PM   #17
MinaciousGrace
Re: Auto chart generator achieves 60% predicting accuracy on existing challenging+ ch

Quote:
Originally Posted by DaBackpack
Presumably, there are a set of "rules" and "heuristics" for good tracks.
I don't want to get too far into this discussion, in part because I don't know much about pad charting. From what I understand, pad charting has historically been more or less linear: if there is a prominent sound, there is a note. Want a harder chart? Step more sounds. Beyond this, the major considerations are how the patterns choreograph the movement of the player through the chart.

The playing metagame enforces the charting meta and vice versa. I've heard that recently more of what kb players would consider "technical" charts are being produced and played. If you don't know what I mean by that, then I guess that underscores one of the major disconnects between pad and kb charting.

Anyway, the point is that the decision making process in creating pad files tends to focus less on the correlation between the music and the placed notes than on the correlation between the placed notes and the playability thereof. So the machine learning process isn't learning how to express music. It's learning how to identify prominent peaks in waveforms and use them to create pad-playable patterns within a very restrictive scope.
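To make "identify prominent peaks in waveforms" concrete, here's a toy sketch in Python. The function name, the fixed threshold, and the use of a precomputed amplitude envelope are all my own illustration; real onset detectors (spectral flux, adaptive thresholds, etc.) are far more involved.

```python
def pick_peaks(envelope, threshold=0.5):
    """Naive peak picking over an amplitude envelope (a list of floats).

    Keeps indices that are local maxima above a fixed threshold, which
    is roughly the "prominent sound -> note" step in its crudest form.
    """
    peaks = []
    for i in range(1, len(envelope) - 1):
        if (envelope[i] > threshold
                and envelope[i] >= envelope[i - 1]
                and envelope[i] > envelope[i + 1]):
            peaks.append(i)
    return peaks
```

Each returned index would then get a note; everything hard (which column, which sound it represents) happens after this step.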

Keyboard charting has its own set of concerns governing "playability"; however, as a general proposition, the variety of patterns and types of patterns that are playable outstrips pad by a huge margin. Combine this with the fact that keyboard players are, by pure physical nature, able to hit more notes, and you end up with a situation in which a huge number of decisions must be made concerning factors that pad charting doesn't even consider. It's easier to explain with an example.

In kb charting, layering is the process by which multiple different sounds that fall on the same note are represented through chords. It is the driving force behind "jumpstream". Pitch relevancy (pr for short) is a generally observed guideline that as notes/tones ascend in pitch there should be a matching "rise" of the notes across the columns, so, 1234. Descending pitch should be represented as 4321. However, we can't do this infinitely because we only have four columns. Suppose we have 8 notes forming a descending scale; we can wrap it around using 43214321. Large differentials in pitch can be represented by traversing more columns with each successive note. Instead of 4321 we might have 4213. You get the idea.
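A toy version of that wraparound rule can be sketched in a few lines of Python. The function name, the semitone threshold for a "large" interval, and the "bigger interval = traverse two columns" heuristic are my own illustrative choices, not anything from an actual editor or from the post:

```python
def pr_columns(pitches, start=4, big_interval=7):
    """Toy pitch-relevancy mapper onto 4 columns (1-4).

    Moves one column right for rising pitch and one column left for
    falling pitch, wrapping around at the edges; intervals of at least
    `big_interval` semitones traverse two columns instead of one.
    """
    cols = [start]
    for prev, cur in zip(pitches, pitches[1:]):
        step = 2 if abs(cur - prev) >= big_interval else 1
        direction = 1 if cur > prev else -1 if cur < prev else 0
        # map 1..4 -> 0..3, move with wraparound, map back to 1..4
        cols.append((cols[-1] - 1 + direction * step) % 4 + 1)
    return cols
```

Feeding it an 8-note descending scale (MIDI pitches 72 down to 65) produces the 43214321 wrap from the example above.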

Now suppose we have a song with snares and bass hits alternating on 8ths with a 16th melody. We'll have double notes on 4ths and 8ths, and all 16ths will be single notes. Snares tend to have higher pitches than bass hits, so when layering onto the melody we'll prefer to place the associated note further to the right for snares and further to the left for the bass hits. Placing layered jumps by pr doesn't always coincide with this rule, though it often can and will. If you have a high note in the melody accompanied by a snare hit you'll probably place down a [34], and if you have a bass, a [14]. Here each note in the chord is placed according to its individual pitch relevancy relative to the previous 16th note, but you may also place chords according to the pitch relevancy of the chord relative to the last chord placed on the 8th.
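That layering preference can be sketched as a tiny rule, too. This is only a caricature of the [34]/[14] example, assuming the melody note's column has already been chosen; the function name, the fixed snare/bass columns, and the collision fallback are hypothetical:

```python
def layer_chord(melody_col, percussion):
    """Toy layering rule: pair the melody note's column with a percussion
    column biased right for snares (higher pitch) and left for bass hits.

    Columns are 1-4. If the percussion column would stack on the melody
    column, nudge it one column inward (an arbitrary tie-break).
    """
    perc_col = 4 if percussion == "snare" else 1
    if perc_col == melody_col:
        perc_col = 3 if percussion == "snare" else 2
    return tuple(sorted({melody_col, perc_col}))
```

A high melody note on column 3 with a snare yields (3, 4), the [34] from the example; the same note with a bass hit yields (1, 3).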

Now, there's another problem, and this is where keyboard playability comes into play. Very few melodies with accompanying percussion will produce 100% strict pr adherence without forming 16th jacks (44, for example). You can get this if you have an ascending melody on which you have a snare on an 8th, for example, 12[34]4. The ascending melody follows pr rules and the snare placement also follows pr rules, but following both rules results in something awkward and frequently avoided (but not always). Suppose we add a new rule that dictates we don't form 16th jacks within vanilla jumpstream. Moving the note to 1 would be the most compliant with pr.
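Checking for that forbidden pattern is the easy part, which is worth noting because it's one of the few rules here that reduces to a mechanical test. A minimal sketch, representing each 16th row as the set of columns it occupies:

```python
def has_16th_jack(rows):
    """rows: consecutive 16th-note rows, each a set of columns (1-4).

    Returns True if any column appears in two adjacent rows, i.e. a
    16th jack (including a single note jacking into a chord).
    """
    return any(a & b for a, b in zip(rows, rows[1:]))
```

The 12[34]4 pattern from the example trips the check (4 appears in two adjacent rows), while the corrected 12[34]1 does not. The hard part is everything the post describes next: deciding *which* repair to make and what it expresses.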

Suppose the next note is a large jump in pitch, one that could easily justify moving three columns to the left. Now we have 12[34]14, and the issue is that it plays the same as if the next note in the sequence were a reversal of the pitch change, slightly lower in pitch than the previous note. Strict adherence to pr would dictate we cross back across the columns and place the note on 4 as well. This would produce the same pattern despite two radically different melodies. Some stepartists would strictly adhere to pr, while others would forego the pr rules and emphasize the pitch change by using 12[34]13, a two-column jump.

This is a stylistic distinction (a trait of the specific stepartist in question) that produces two different patterns that play differently and call attention to different aspects of the music. Is it more important to call the player's attention to the major pitch change? Is it more important to remain faithful to pr? If we call attention to the major pitch change, will we detract from a similar technique used to highlight something else more interesting in the music? Which way will it play better? Can we teach a computer to appreciate subtle nuances in note placement based on what the stepartist intends to draw the player's attention to? Can we teach a computer how to make decisions on how to express what it hears (sees) in the music?

Is it worth asking these questions before we can programmatically determine the criteria upon which these decisions must be made? It needs to know what is a bass hit, what is a snare hit, and how much the pitch in the melody changes. What if the percussion also varies in pitch? Does it heed pr in the interest of staying faithful, or does it ignore it in favor of a different set of rules that produces something more fun? Can it decide whether to pr chords based on previous chords or on individual notes within the chords?

Suppose intense vocals obscure parts of the melody or less affirmatively hit drums. Can it identify obscured bass drum hits? Obscured melody notes? If it can't, can we teach it to interpolate and assume there must be one there, and place a note anyway? Do we change the entire schema of the section to follow the vocals and not the melody and percussion? Can we teach a computer to do that?

Suppose we could use machine learning to teach a program to do all these things. Can we teach it to be consistent in its decision making across an entire chart? Can we teach it to make exceptions to its consistency? To make structured variations, and then to deviate from them? That's a lot of what music is, really: deviation from variation. People have been trying to get computers to compose music for as long as computers have existed. Obviously they didn't have the benefit of machine learning and the modern computing power we have today, but we still haven't succeeded. Anyone familiar with music theory knows how formulaic, how mathematical it is. You can try to reduce it to saying it's just a bunch of rules governing what instruments get played at what points in time. We can all look at generic pop songs and think to ourselves that they could be spit out of a program, but we haven't been able to do that, because it's more than that, or at least, if you disagree with that sentiment, it's more rules than we've been able to handle thus far.

Most of what I've discussed concerns a single example of a single rule in keyboard charting, one that is heeded, partially heeded, or flat-out ignored depending on the context of the chart and how it has progressed during construction, the charting style of the author, the difficulty and playability of the patterns it produces, and/or whether the stepartist simply chooses to vary charting technique within the same file, or to vary which specific sounds they are charting. This is all after the stepartist has decided how they wish to stylistically approach a piece of music to begin with. If music is an expression, a chart is an expression of an interpretation of that expression. And this is all before the stepartist has a chance to review their work, to playtest it, to make adjustments based on feedback, and to gauge whether or not that feedback is good.

I'm not saying it's impossible to auto-generate keyboard charts that could pass as human-generated (or, if that bar is too high, simply "worthy of being played"); in fact, I think it is a possible eventuality. But the decision-making tree involved in technical keyboard charting exponentially dwarfs that of pad charting (insofar as I understand the latter; in the context of keyboard dump files there is no debate). To equate the two is, in my opinion, naive. If I'm wrong and it is fair to equate the two, then the project is shortly going to run into insurmountable walls and observe logarithmic gains in accuracy over time as it has to learn and interleave an ever-expanding set of rules and exceptions.

It may be feasible within the narrow context of generating levels in a video game to go with music (interesting side thought: do you think it possible for a program to use machine learning to absorb every platformer game ever produced, and then use that information to auto-generate levels for humans to play? For an entire platformer to have 100% autogenerated levels?), but you're still boiling it down to the fundamental problem of "how do we get computers to think like humans?".

Last edited by MinaciousGrace; 03-28-2017 at 04:31 PM.