A Fork in the Road

The Status Quo

I started developing my SNES emulator, bsnes, nearly six years ago. When I first started on bsnes, my goal was absolute hardware accuracy. But of course, starting from a blank slate, you cannot go straight to perfection right away.

I have also become painfully aware of the realities of accurate emulation, and why it has not been done before. The system requirements are extremely intense. And when you want to keep code readable on top of that, all bets are off. The failure of the layman to understand this has also been a source of personal torment that I grow increasingly weary of.

Recently, I've completed the last vital component of a virtually perfect SNES emulator: a cycle-based video renderer. This is in stark contrast to every other SNES emulator ever made, which renders entire scanlines at a time. By comparison, a cycle-based video renderer requires exactly 684 times the synchronization work of a scanline-based video renderer.

As you can imagine, the increase in system requirements is horrifying. Compared to the scanline-based versions of bsnes, the cycle-based version is a full three times slower. It is not 684 times slower, because synchronization is only part of the overall picture, and there are other processors involved that also consume time. Even so, not even top-end processors are capable of maintaining playable framerates.

Throughout the years, I have always strived with bsnes to attain maximum accuracy, but also to keep the system requirements reasonable. Now, reasonable to me is clearly not reasonable to everyone else. I believe that requiring a processor less than seven years old is reasonable, while others will disagree. Especially those who only own even older processors. But I digress.

There have always been compromises in bsnes' design for the sake of speed. These compromises came in two forms. The first was in sacrificing code clarity. One example of this is that I have implemented the S-DSP processor using enslavement. This tends to be significantly faster than true cooperative multithreading, although the resultant code is a good bit sloppier, and less consistent with the rest of bsnes' core design.
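To make the enslavement idea concrete, here is a minimal sketch. It assumes that "enslavement" means one processor is driven as a subroutine of another and only catches up when its state is about to be observed, rather than running as its own cooperative thread. Every name here (Dsp, Smp, catch_up, ClocksPerSample) and the clock value are hypothetical and chosen for illustration; this is not bsnes' actual code.

```cpp
#include <cstdint>

// Hypothetical slave processor: it never runs on its own.
struct Dsp {
  static constexpr int ClocksPerSample = 32;  // illustrative value only

  std::int64_t clock = 0;       // how far the DSP has been emulated, in clocks
  unsigned samples = 0;         // stand-in for audio samples generated so far
  std::uint8_t regs[128] = {};  // register file visible to the master

  // Run forward in whole-sample steps until caught up with the master's clock.
  void catch_up(std::int64_t target) {
    while (clock + ClocksPerSample <= target) {
      clock += ClocksPerSample;
      ++samples;
    }
  }
};

// Hypothetical master processor: it owns the timeline and drives the slave.
struct Smp {
  Dsp& dsp;
  std::int64_t clock = 0;

  explicit Smp(Dsp& d) : dsp(d) {}

  // Ordinary execution only advances the master's clock; no DSP work happens.
  void step(int clocks) { clock += clocks; }

  // The DSP is synchronized only when the master touches its registers.
  std::uint8_t read_dsp(std::uint8_t addr) {
    dsp.catch_up(clock);
    return dsp.regs[addr & 0x7f];
  }

  void write_dsp(std::uint8_t addr, std::uint8_t data) {
    dsp.catch_up(clock);
    dsp.regs[addr & 0x7f] = data;
  }
};
```

The speed comes from skipping context switches: the slave runs in large batches at the few points where the master needs it. The cost is the one described above, since the slave's timing logic ends up tangled into the master's access paths instead of living in its own self-contained thread.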

The second compromise was a matter of the more-accurate code simply not existing yet. Indeed, when bsnes first started off, absolutely nothing was cycle-based. The CPU and SMP executed entire opcodes at a time, the DSP generated entire samples at a time, and the PPU generated entire scanlines at a time. As time passed, and my understanding of the SNES hardware increased, I have gone back and rewritten each component to be more accurate.

Up until now, I've always immediately utilized these more accurate cores, each time taking significant speed hits in doing so. But nothing has been quite as demanding as this new video renderer. I knew this going in as well; it is the reason I put off writing a cycle-based video renderer for so many years, and saved it for the absolute last step.

I know very well that an emulator that cannot attain full framerate on one's computer will not be used. And without any users, bugs will go unnoticed, and accuracy will not be improved. So in a way, bsnes has always been a sort of balancing act. Accuracy and clean code on one hand, and just enough performance and trickery for playable frame rates on the other hand.

Overall, this has been a pragmatic approach; but in reality it ends up with the worst of both worlds. The perfection I want cannot be obtained with compromises, and the requirements are too high for any kind of larger-scale user acceptance. I can't advance the overall state of SNES emulation if I have very few people actually using bsnes. I had hoped that other emulators would improve over time, but that really isn't happening. Aside from a single minor update to Snes9X, no emulator has made any public progress at all in the last four years now.

Clearly, I am at a crossroads. Something needs to change.

Code Design

bsnes has always had the capability of having multiple cores, via base classes for each core processor. This is how I have, over the years, gone back and created newer, more accurate cores for each processor, moving from higher-level concepts to cycle-level cores.

But even this abstraction had a cost to code clarity. It's one extra layer that really served no purpose other than allowing less accurate cores the chance to live on a little longer.

It also has a terrible side effect. The cores are not dynamically swappable, because that would destroy any chance of inlining, resulting in a very significant slowdown. As a result, the only way to have two different cores is to release two different builds. This is why v065 came with bsnes and bsnes-accurate. The former used the scanline-based video core, and the latter used the cycle-based core.
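The inlining trade-off can be sketched in a few lines. In the first half, the core sits behind a virtual interface, so it could be swapped at run time, but each tick() is an indirect call that the compiler generally cannot inline. In the second half, the core type is fixed at compile time, so the call can be inlined into the hot loop. All of the type and function names are hypothetical and only illustrate the general technique, not bsnes' class layout.

```cpp
// Runtime-swappable core: callers only know the abstract interface.
struct PpuBase {
  unsigned ticks = 0;
  virtual ~PpuBase() = default;
  virtual void tick() = 0;
};

struct CyclePpuDynamic final : PpuBase {
  void tick() override { ++ticks; }
};

// Each iteration dispatches through the vtable, because the concrete
// type behind `ppu` is not known where this loop is compiled.
void run_dynamic(PpuBase& ppu, unsigned count) {
  for (unsigned i = 0; i < count; ++i) ppu.tick();
}

// Compile-time-selected core: no base class, no virtual call.
struct CyclePpuStatic {
  unsigned ticks = 0;
  void tick() { ++ticks; }
};

// The concrete type is a template parameter, so tick() is visible to the
// optimizer and can be inlined into the loop body.
template <typename Ppu>
void run_static(Ppu& ppu, unsigned count) {
  for (unsigned i = 0; i < count; ++i) ppu.tick();
}
```

Both loops do the same work; the difference is only in what the compiler is allowed to see. A function called hundreds of times per scanline is exactly where that difference adds up, which is why choosing the core at build time means shipping one binary per core.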

This just creates needless confusion.

Code Fork

My solution to all of the above has been to fork the bsnes codebase. Now, fork in this sense is too strong a word. Most of the code in bsnes is either not resource-demanding or can be shared directly: code such as the user interface, special chip emulation, cartridge loading, cheat code support, memory mapping, and the underlying system architecture.

So from this point on, the accurate version of bsnes has been split to a new project, named asnes. This project will represent absolute, unrelenting accuracy, no matter the cost to performance.

bsnes is now a superset of asnes. Using some compile-time magic, I simply substitute in speed-optimized alternatives to any module from asnes. For now, this would be the scanline-based PPU, and a state machine-based DSP. In the future, this list will grow.

Going Forward

So now that I have asnes, where I can focus all of my accuracy efforts, there is no longer such a rigid demand to keep the bsnes side of things as readable. Since bsnes is no longer needed as a self-documenting code base, I can utilize more complex methods of increasing speed. Things like range-based IRQs, audio processor enslavement, video processor enslavement, more advanced caching methods, and so on and so forth.

I do not see any point in a race to the bottom, so my goal here is not to turn bsnes into a ZSNES or Snes9X clone. Those already exist and fill their niche just fine. I am instead envisioning something like a Nestopia. Very high compatibility with decent overall performance.

If you consider that bsnes has always compromised on the video rendering with its scanline renderer, that suggests a point of diminishing returns where the accuracy gained is not worth the performance lost. Realizing that a scanline renderer does not render any commercially released software unplayable, I believe we can apply a similar model to the other processors. Scale back on the esoteric hardware features that are not used by any commercial software to improve performance; but do everything possible to keep game-specific hacks out of the core, and keep compatibility as close to 100% as is possible. This is far more about code optimizations than it is about accuracy sacrifices.

Of course, this isn't going to happen overnight. It's a long-term strategy. And just as bsnes obtains higher performance, asnes will obtain higher accuracy.

The two emulators now represent a yin and yang to my overall philosophy of emulator design.

Reasoning Behind Naming Conventions

There are a few reasons why I chose to rename the accurate version, rather than the speed-oriented version. The most superficial is simply that asnes has a logical appeal as "accurate SNES", with bsnes directly appearing as one grade lower in terms of overall accuracy.

But the more important reason is that I've always wanted bsnes to be an emulator that people actually use. Regardless of how fast computers get, there's very little reason anyone would ever want to use asnes as a general purpose SNES emulator for gaming. It is simply a hardware preservation project.

Lastly, there is the fact that asnes is my first chance to really start off with a 100% full delivery of my initial promise or claim. asnes, right from the very start, represents near perfection. Whereas bsnes has always been a work in progress. That's no different even with this fork. bsnes has a long way to go before it can claim to be a high performance emulator. Given that no official bsnes release has used the cycle-based video renderer, this naming convention keeps bsnes right on track with where it was, representing the least amount of change to end users.

I felt that attempting to have two emulators, continually drifting apart yet using the exact same bsnes name, would just cause too much confusion.

© 2010 byuu