Thursday, July 23, 2015

The Process of Ripping a Dreamcast Game

A while ago, I blogged a little intro to ripping Dreamcast games (I will continue that). I wanted to go over what my personal process looks like, and what work is involved. Lately I've been working on ripping Sonic Adventure 2, a beast of a game. There's a lot of data, and fitting it to a 700mb CD-R is a challenge. So I'm going to go over what it all looks like, from start to finish.

1. Picking a Game
This may seem easy, just a matter of "what would I like to play?" but that's not always the case. The Dreamcast was never around long enough for there to be an official scene ruleset developed for the system. Yes, piracy has rules and they are very strict. If you're in the scene, violating the rules can get your releases nuked and you can lose access to sources. Groups in the scene have to ratify rules. Sometimes groups will disagree with them to the point that they'll abstain from signing and participate as outsiders.

Well, the Dreamcast never got that. Most consoles have a near-complete set of scene rips, but the Dreamcast doesn't. Echelon was the most active group back in the day, ending up with about 250 releases or so. There were other groups, but even all of them together didn't cover every Dreamcast game.

I have the means to dump GD-ROMs myself, but it's largely irrelevant honestly. TOSEC's GDI set is verified, and if I did dump my own GD-ROM, I'd check it against TOSEC's hashes to verify that my dump was good anyways. So I've got a 500gb TOSEC set to pull GD-ROM dumps from.
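Checking a dump against reference hashes can be sketched roughly like this. It is a minimal sketch, not the tooling I actually use: the function names are made up for illustration, the choice of SHA-1 is an assumption, and the `expected` mapping stands in for whatever hash list you pull from the verified set.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large track images never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dump(dump_dir, expected):
    """Return {filename: True/False} for each expected track file.

    `expected` maps a file name to its reference hex digest (hypothetical
    input format). A missing file counts as a failed check.
    """
    results = {}
    for name, reference in expected.items():
        path = Path(dump_dir) / name
        results[name] = path.is_file() and file_digest(path) == reference.lower()
    return results
```

If every value in the returned dictionary is `True`, the dump matches the reference set and there's nothing gained by keeping your own copy separately.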

So, why is picking a game difficult? Personally, I've got a lot of criteria that I go by. Can I improve on existing rips? Is there anything wrong with existing rips? Do I like this game? Does anyone want this game? How difficult is this game to work with? Sometimes I take all of those into account. Sometimes I take none of them into account and it's just a matter of getting a game done. Sonic Adventure 2 is a game that I really love, but I just never looked into it myself. I had gotten requests for it before, but I decided I'd actually take a look into it this time. I was unsure if I was going to, but then...


I was looking through the game's files and heard that. Suddenly it was 2002 (when I played the game) again and this song was awesome. So, that settled it. Sonic Adventure 2 it is.

2. Research
Most Dreamcast games don't require much work. Until about July of 2000, games didn't have any kind of copy protection aside from the Dreamcast verifying it was loading a GD-ROM. A lot of games still didn't after that point, but you can bet that Sonic Adventure 2 (from June of 2001) has protections that must be cracked to boot from a CD-R.

So, how do I research that?

First, I go all the way back to the Echelon days and pull their NFO files. Echelon's NFOs don't usually provide any specifics on cracking, but they do give an idea of what you could potentially be dealing with. Next, check for patches. Since Echelon functioned as a scene group, they were basically racing other ripping groups to get theirs out first. This led to mistakes, and they put out patches to fix them. Then I'll take a look at NFO files from contemporary rips. Once I feel that I've gathered enough info, I'll start looking through the game's files and getting an idea of what needs to go. NFO files assist with this as well because they give you a frame of reference. What did earlier groups have to remove to get everything down to size, and how can modern ripping techniques negate the need to remove some of these files?

Now we have our starting point. From looking over everything we can learn that...
1. Sonic Adventure 2 is heavily protected
2. Due to the size of the game, the audio was always made mono instead of stereo
3. Videos had to be heavily downsampled
4. The Chao Garden is broken in basically all rips since they're based on Echelon's hack

From that, I'll make my goals...
1. Maintain stereo audio
2. Fix the Chao Garden
3. Add the DLC to the on-disc web browser so that it can still be accessed

Now, there are ways around the sizes of games. One common way to do it is a split release. A split release will break a game up into multiple discs with the NFO telling you when to switch to a different disc to continue with the game. My personal philosophy with ripping is to maintain the best possible quality for an 80 minute CD-R while maintaining the original number of discs. There are 99 minute discs, but frankly they're not easy for most people to get and burners that can handle them well are uncommon. The Dreamcast is common in less fortunate countries than MURRKA, and it's not fair for them to be left out of the fun. Aside from that, it's fun to set the bar high to really push yourself.

3. Ripping the Game
Now we're into the meat of the process. The actual work. For this process, I work in a few steps. First I'll take a look at file sizes and see where my largest problems are. At this point I'll usually hash all of the files and check for any duplicates. Once I know what I'm looking at in terms of data, I'll start moving files around between the two data sessions (assuming there's no CDDA) so I can start getting an idea of what I need to do to fit all of the files into the second data session, which is where the majority of the data is. Once I've got a rough idea, I'll start downsampling and squeezing everything in place. Finally, the game will be cracked and tested.
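The "hash everything and look for duplicates" step can be sketched like this. It is only a rough illustration: the function names are mine, SHA-1 is an arbitrary choice, and the savings figure assumes each duplicate group ends up stored on the disc exactly once.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` by content hash; return only groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha1(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


def bytes_saved(duplicate_groups):
    """Space recovered if each group is stored once and the rest point at it."""
    return sum(paths[0].stat().st_size * (len(paths) - 1) for paths in duplicate_groups)
```

Running `bytes_saved(find_duplicates("extracted_game"))` on an extracted file tree (the folder name is a placeholder) gives a quick idea of how much space duplicates are costing before any downsampling starts.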

For Sonic Adventure 2, I started with the video. In the research phase, I learned that Sonic Adventure 2's video is unique. The Dreamcast's most common video format is MPEG-1 video multiplexed with ADX audio in a container called SFD. It's very similar to a basic MPEG container. Sonic Adventure 2's SFD container is actually very similar to a modern MKV or M4V container in that it contains multiple audio streams, a Japanese language stream and an English language stream. I had all of the correct tools to deal with this situation, but I immediately ran into a road bump. My typical process for demultiplexing the video was only giving me the first audio stream, the Japanese audio. This is where being flexible is handy.

After some research, I found a program called VGMToolbox which has demultiplexers for a lot of different video formats common to games. This let me dump out both audio streams. I mentioned previously that this game has a huge amount of data, so first I had to strip out as much of the Japanese audio stream as I could. The stream can't be dropped outright, though. The Japanese track is multiplexed as the first track, so if it isn't present in some form, the English track ends up in the first position and the Dreamcast won't play it. This was simple, just a matter of opening the audio track in a hex editor and deleting all of the data beyond the header. Now for the video itself. There are two different ways to handle MPEG-1 video with modern ripping. Back in the day, variable bit rate encoding wasn't really much of a thing. Everything was done with a constant bit rate, which meant that if you needed a max bitrate of 3000kbps to keep your video quality good, a black screen would still have that same bitrate. Variable lets the bitrate drop to near 0 if you have a black screen. This saves a lot of space.
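The hex editor step of cutting an ADX track down to its header could be scripted along these lines. This is a sketch under assumptions, not what I actually ran: it relies on the commonly documented ADX layout (a 0x8000 magic, then a big-endian 16-bit offset, with audio data starting 4 bytes past that offset), and the function name is invented for illustration.

```python
import struct


def adx_header_only(data):
    """Return just the header bytes of an ADX stream, dropping all audio frames.

    Assumes the commonly documented layout: magic 0x8000, then a big-endian
    16-bit offset, with audio data starting 4 bytes past that offset.
    """
    if len(data) < 4 or data[:2] != b"\x80\x00":
        raise ValueError("not an ADX stream (missing 0x8000 magic)")
    (offset,) = struct.unpack(">H", data[2:4])
    header_length = offset + 4
    if header_length > len(data):
        raise ValueError("header length exceeds file size")
    return data[:header_length]
```

Reading the demuxed Japanese track, passing its bytes through `adx_header_only`, and writing the result back out leaves a stub that still occupies the first track slot when the SFD is remuxed.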

Whenever I re-encode Dreamcast video, I always use a variable bitrate. The real question is a matter of encoder. If there's space to work with, I'll use a typical MPEG-1 encoder. If space is a pressing issue, then the big guns come out and I'll use a KVCD encoder for MPEG-1. KVCD is a very optimized encoding matrix that allows you to REALLY abuse the bitrate before there's much visible degradation of picture quality.

Obviously, we'll go with KVCD here. Initially, I encoded all of the video with 1200kbps KVCD and remuxed the files into SFDs with the ADX header file in place for the Japanese tracks so the English track would be positioned correctly.
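To get a feel for what the bitrate change buys, here is some back-of-the-envelope arithmetic. The 3000kbps constant-rate figure is the example number from above, 1200kbps is treated as an average rate (an assumption, since a variable bitrate moves around), and the 60 second clip length is made up.

```python
def video_size_bytes(average_kbps, seconds):
    """Approximate video stream size: kilobits per second converted to bytes."""
    return average_kbps * 1000 * seconds // 8


clip_seconds = 60  # hypothetical clip length
cbr_size = video_size_bytes(3000, clip_seconds)   # constant 3000kbps
kvcd_size = video_size_bytes(1200, clip_seconds)  # assumed 1200kbps average
print(cbr_size, kvcd_size, cbr_size - kvcd_size)
```

For that hypothetical minute of video this prints `22500000 9000000 13500000`, so roughly 13.5mb saved per minute before the audio streams are even counted.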

With video out of the way (for now), it's time to turn to the audio. The research phase can be treacherous at times, and Sonic Adventure 2 was a good example of this. One NFO led me to believe that Sonic Adventure 2 used modified ADX headers which stored loop data at different bytes, rendering the typical encoders which respect loop points useless. Loop points allow ADX files to loop endlessly, so this is important for a game. I wasted a day inspecting the header data and trying to determine how it differed.

The truth was that it wasn't different at all. I figured this out by just saying "fuck it" and going for it. Typically, Dreamcast games' ADX files have a sample rate of 44khz. Even Sonic Team found that their game was pushing it on space, and the ADX files for Sonic Adventure 2 are 32khz. Usually you'll knock 44khz audio to 32khz for some quick and dirty downsampling, but that's out the window here. The only game I've seen with a lower audio quality is Ecco The Dolphin: Defender of the Future. It was 24khz for that game, which was fine since the soundtrack was synthesizer heavy. Since I would have to be reducing an already lower sample rate, quality became a concern. I found that 24khz was too low and the bass became very distorted. I settled in at 26khz as this worked best for the game's rock soundtrack.

With a game with so much data, some sacrifices have to be made. Japanese audio already had to go for the videos, so the Japanese voice files are going to have to go as well. A cool 75mb saved. Most games have a ton of duplicate files, so we can save space there as well. Using hardlinking, we can make it so that only one copy of a file goes on the disc, but pointers are present which will redirect the system to the single copy of the file. 5mb down.

At this point, I'll usually make ISOs for both of the data sessions and merge them to a CDI to see how close everything is to fitting a CD-R and make a plan for proceeding. And Sonic Adventure 2 is.... 70mb over. Jesus Christ.
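The "how far over am I?" check is just sector arithmetic, and a rough version can be sketched before building any images. This is a simplification under stated assumptions: 2048 bytes of user data per data sector, 75 sectors per second of disc time, and no allowance for filesystem structures or the gap between sessions, so the real CDI will come out somewhat larger. The function names are mine.

```python
SECTOR_BYTES = 2048       # user data per data sector (assumed)
SECTORS_PER_SECOND = 75   # CD addressing: 75 sectors per second of disc time


def disc_capacity_bytes(minutes=80):
    """Raw data capacity of a CD-R of the given length."""
    return minutes * 60 * SECTORS_PER_SECOND * SECTOR_BYTES


def sectors_needed(file_sizes):
    """Each file occupies whole sectors, so round every size up."""
    return sum(-(-size // SECTOR_BYTES) for size in file_sizes)


def overage_bytes(file_sizes, minutes=80):
    """Positive result: bytes over capacity. Negative: bytes to spare."""
    return sectors_needed(file_sizes) * SECTOR_BYTES - disc_capacity_bytes(minutes)
```

An 80 minute disc works out to 360000 sectors, or 737,280,000 bytes, which is the ceiling everything in both data sessions has to fit under.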

4. Back to the Drawing Board
I've cut the video and audio to a size I thought would fit, and I'm still over. Now we've got to start getting clever. One of the things I do for a rip is create a folder called EXTRAS and place data in that folder which would be found on part of the GD-ROM that could be read by a computer. A lot of the time this is empty but a few developers included desktop wallpapers and artwork from the game. Sonic Adventure 2 included some rather large wallpapers. Those will go first, an extras folder is just fine packed in an archive alongside the game.

Next I'll retrace my steps. First, back to the video. I use bat files for most of my work as most Dreamcast utilities are exes which must be run from the command line. Automating that saves a lot of time. Since I need as much space as possible, I went over each video individually and took the bitrate as low as it could possibly be while maintaining a fair level of detail and minimizing macroblocking. It's important with KVCD that you watch every video after it's encoded. There are some problems with the encoder, and if the bitrate drops too low you'll get some nasty glitchy macroblocks that pop up. This saved about 15mb.

Back to the audio now... I know that 24khz is too low. I'm already at 26khz. This leaves me... 25khz. So, that'll be it then. 25khz sounds close enough to 26khz in terms of quality and the bass isn't distorted like at 24khz. Redoing all of the audio saves me 10mb.
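Why does a 1khz drop matter at all? ADX size scales linearly with the sample rate. The sketch below assumes standard ADX framing (18 byte blocks holding 32 samples per channel) and ignores the small header, so treat the numbers as estimates rather than exact file sizes.

```python
ADX_BLOCK_BYTES = 18        # assumed standard ADX block size
ADX_SAMPLES_PER_BLOCK = 32  # samples per channel in each block


def adx_bytes_per_second(sample_rate_hz, channels=2):
    """Approximate ADX data rate for a given sample rate and channel count."""
    return sample_rate_hz * ADX_BLOCK_BYTES * channels / ADX_SAMPLES_PER_BLOCK


for rate in (32000, 26000, 25000, 24000):
    print(f"{rate} Hz stereo: {adx_bytes_per_second(rate):.0f} bytes per second")
```

Going from 26khz to 25khz trims one part in 26, a bit under 4% of the audio data, which only adds up to something useful because there is so much audio on the disc.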

Creating another CDI tells me that I still need to lose about 30mb. Unfortunately, we have to remove one of my goals from the list. That goal is adding DLC to the web browser. The web browser in Dreamcast games functions as sort of a standalone thing. It's not coded into the game's executable. Rather, the game boots into the browser separately. We're desperate for space, so it'll have to go. It doesn't change how the game itself plays, so it's expendable in this world. Losing the browser and the wallpapers saves about 15mb.

This is the point of true desperation since 15mb needs to go. In this situation, I usually finalize the ISO for the first data session. I'll hardlink the files, build the ISO and see how much space I have before I'm at my size limit for the 45000 boot LBA. I'll cram as many files as I can and then purely focus on the second data session.

There's nothing else I can adjust at this point and I need 15mb of data gone. This leads into another stage of research. For games like Sonic Adventure 2, fans on the internet have really picked the game apart. If there are unused files, they've found them and documented them on a website for curiosity's sake. I'll comb over these sites, locate the unused files and remove them. For Sonic Adventure 2 this barely saved me a mb. It ended up being two songs related to the Chao Garden and some textures as well. I also ended up removing two audio files related to online connectivity for Chao functions since the servers are now offline.

And we still need 14mb. Time to expand our horizons with hardlinking. Going over the audio, I found that two versions of one song were on the disc. One had a two second intro which featured a bass riff and one did not. I linked these two files so that the only one on disc was the one with the bass riff. A negligible change. 3.5mb saved. Each character has a theme song in Sonic Adventure 2, a long version and a short version. The difference between these tracks was about 20 seconds. I removed the longer versions in favor of the shorter versions for a few of the characters. We're now in business and the game fits to a disc.

The tightest I had ever packed a game was Ecco the Dolphin. There was about 5mb of space on the disc which I ended up dummying out. Sonic Adventure 2 had 700kb.

5. Cracking the Game
Our game now fits to a disc... but does the Dreamcast play it?

For Sega's games, there wasn't a lot of variety in protections. I've got a list of common protection schemes and what bytes they show up as in hex. I'll search the game's binary and fix them with the known fix. Sonic Adventure 2 fell into this category, thankfully. Echelon had some problems with it, and their binary has been reused for contemporary rips. Their binary is coded for an 11700 boot LBA though, and I use 45000, which replicates the GD-ROM's boot LBA. Since we're mimicking that, you typically have to fuss with games less.
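The search-and-fix pass over the binary boils down to finding known byte signatures and overwriting them with same-length replacements. The sketch below shows only that mechanism. The function names are mine, and any signature or fix bytes you feed it in an example are placeholders, not real protection patterns.

```python
def find_all(data, pattern):
    """Return every offset at which `pattern` occurs in `data`."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def apply_patches(data, patches):
    """Replace each known signature with its same-length fix; report hit counts.

    `patches` maps a label to a (signature, fix) pair of byte strings.
    """
    patched = bytearray(data)
    hits = {}
    for name, (signature, fix) in patches.items():
        if len(signature) != len(fix):
            raise ValueError(f"{name}: fix must be the same length as the signature")
        offsets = find_all(bytes(patched), signature)
        for offset in offsets:
            patched[offset:offset + len(fix)] = fix
        hits[name] = len(offsets)
    return bytes(patched), hits
```

Keeping the fix the same length as the signature matters because nothing else in the executable may shift. The hit counts make it obvious when a pattern from the list doesn't appear in this particular game.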

Now that our game fits to a disc and it's cracked, I'll do an emulator test with nullDC before burning for a hardware test. The emulator test lets me know if anything is horribly wrong. The emulator test is not always right though. Due to the inaccuracies emulators can have, a game won't always fail in the emulator where it would on real hardware.

The game boots, the music loops and it looks like everything is good. Thankfully, it behaves identically on the actual hardware.

6. Testing
Most games don't require much testing due to their simplicity. I don't really trust Sonic Adventure 2 for that, though. In this situation, I'll modify the game's boot information to display a test release notice and distribute it to a few interested parties online. The test release notice is there purely so no one redistributes it as their own work. It hasn't happened and the people who offer to help are honest, but just in case...

This game needs to be tested to the end. I understand that parts of the ending end up streaming audio and video at the same time, which is problematic for CD-Rs since they read at a slower speed than GD-ROMs. I'm working through the game myself, and I'll use tester feedback as well. At this point the game is finished, but my concern is tweaking the order that the files are in on the disc. If issues crop up with streaming speed, I'll have to modify the way the files are sorted in order to improve the performance.

7. Release
Sonic Adventure 2 isn't yet at this point, but it's close. This is a simple matter of archiving the game and getting it out to my preferred sites.

The process is complete. We've met two of our goals:

1. The game has stereo audio
2. The Chao Garden works

Sometimes, all of the goals can't be met. Two out of three is pretty damn good though.