Monday 30 January 2017

A note on my new blog series: Writing a NES emulator in JavaScript

Note

Just to let you know that I have started a new blog series in which I will be implementing a NES emulator in JavaScript.

Here is the link:

http://nesemujs.blogspot.com

Enjoy!

Sunday 8 January 2017

Part 18: Exploring the SID

Foreword

In the previous post we implemented sprites within our Android C64 emulator.

The addition of Sprites made the game DAN DARE fully playable within our Android C64 emulator.

The only thing left in this series of posts is adding sound.

I must admit that I was a bit lazy when busy with this post :-) Working on SID emulation from scratch can appear as quite a mountain to climb, especially if you are near the end of writing your C64 emulator.

To make the task bearable, I have decided to peek a bit at how other open source emulators have implemented SID emulation, use their code and modify it to fit within my emulator.

The emulator I "peeked" at was Frodo C64. This is a portable C64 emulator that runs on a number of platforms, including Windows, Linux, AmigaOS and others.

There is even an Android fork of Frodo C64 on GitHub, here, by Arnaud Brochard. This fork, by the way, is the source I will be focussing on in this post, both for the SID functionality and for how to make it fit within our emulator.

Frodo C64 SID emulation

Let us have a quick overview on how SID emulation is implemented within Frodo C64.

As discussed in the foreword, we are going to examine the fork from Arnaud Brochard.

One thing to take note of is that the Frodo C64 source code is written in C++. Thus, to make the Frodo C64 source code fit within the C source code of my emulator, I had to do some tinkering.

In an attempt to stay focussed within this post, I am not going to discuss what tinkering was required for the C++ to C conversion.

However, just keep the C++ to C conversion in mind when comparing my source code with Frodo C64 concerning SID emulation.

The two key source code files that we will be using from Frodo C64 for SID emulation are SID.cpp and SID_android.h.

Of course, there are other source code files providing the JNI glue that gives our native code access to the Java sound system. I will, however, discuss this JNI glue in the next section on integrating the Frodo SID emulation within our emulator.

Let us start by having a look at the file SID.cpp.

The first function of importance within SID.cpp is calc_buffer. The purpose of this function is to provide an output buffer with audio samples that can be played on the Android sound system.

The calc_buffer function is the key function for SID emulation. It loops through all the SID voices and renders the samples for the waveform selected on each voice. This function is also responsible for applying an envelope to the waveform using the supplied ADSR parameters.

Let us have a quick look at how the samples are created for the different waveforms.

Here are some snippets from the waveform switch for getting a sample for the applicable waveform:

...
    case WAVE_TRI:
     if (v->ring)
      output = TriTable[(v->count ^ (v->mod_by->count & 0x800000)) >> 11];
     else
      output = TriTable[v->count >> 11];
     break;
    case WAVE_SAW:
     output = v->count >> 8;
     break;
    case WAVE_RECT:
     if (v->count > (uint32)(v->pw << 12))
      output = 0xffff;
     else
      output = 0;
     break;
...

    case WAVE_NOISE:
     if (v->count > 0x100000) {
      output = v->noise = sid_random() << 8;
      v->count &= 0xfffff;
     } else
      output = v->noise;
     break;
...

As you can see, all waveforms calculate their output samples except the triangle waveform, which uses a lookup table. I presume the lookup table for the triangle waveform was implemented for performance reasons.

The same triangle lookup table is used to get triangle waveforms of different frequencies. To get a triangle waveform for the desired frequency you just need to skip a number of samples each time within the lookup table.

The same principle of sample skipping also applies to the other waveforms to get a waveform for the desired frequency.
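To make the idea of sample skipping a bit more concrete, here is a minimal sketch of a phase accumulator for the sawtooth case. The struct and function names are made up for illustration and are not taken from Frodo; the only things borrowed from the snippets above are the 24-bit counter and the shift by 8 that turns it into a 16-bit sample.

```c
#include <stdint.h>

/* Hypothetical voice state, loosely modelled on the snippets above. */
typedef struct {
    uint32_t count; /* current phase, only the low 24 bits are used */
    uint32_t add;   /* phase step per output sample; larger = higher pitch */
} voice_sketch;

/* Advance the phase by one output sample and return a 16-bit sawtooth
   value taken from the top 16 bits of the 24-bit phase. */
static uint16_t next_saw_sample(voice_sketch *v)
{
    v->count = (v->count + v->add) & 0xffffff; /* wrap at 24 bits */
    return (uint16_t)(v->count >> 8);
}
```

Doubling add makes the phase run through its range twice as fast, which is the same as skipping every second sample and therefore doubles the frequency. A table-based waveform such as the triangle works the same way, except that the phase is used as an index into the table.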

Those familiar with the SID will recall that it allowed more than one waveform to be enabled per voice. This produced unintuitive results.

Frodo C64 also emulates these scenarios where more than one waveform is enabled per voice. Frodo, however, also uses a lookup table to get the samples for these combined waveforms. From the comments it looks like the samples were obtained by recording the output of a physical SID with the applicable waveforms combined.

Another method of importance within SID.cpp is WriteRegister. This is the method you will use to delegate CPU writes to the SID memory region. calc_buffer ultimately also uses these values passed via WriteRegister to do the required rendering.

When integrating with our emulator, we will also be calling WriteRegister from memory.c when we encounter a write to the SID memory region. More on this in the next section.

This concludes our discussion on SID.cpp.

Let us now move on to SID_android.h.

This header file has some Android specifics regarding SID.

A method I want to highlight within this file is EmulateLine. Each time we are finished rendering a line on the display we need to invoke this method.

Each time EmulateLine is called, a check is done whether enough line periods have elapsed to fill an audio buffer.

If the total period is long enough, we call calc_buffer and send the resulting output to the Java Sound System.

One might ask at this point in time how big the audio buffer should be made.

The following comment within SID_android.h gives us a clue:

Note that too large buffers will not work
very well: The speed of the C64 is slowed down to an average speed of
100% by the blocking write() call in EmulateLine(). If you use a buffer
of, say 4096 bytes, that will happen only about every 4 frames, which
means that the emulation runs much faster in some frames, and much
slower in others.
On really fast machines, it might make sense to use an even smaller
buffer size.

The resulting size of the audio buffer that Frodo uses is 512 samples.

A sample buffer size of 512 samples translates to a period of more or less one VIC-II frame.
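As a rough check on that figure, the arithmetic can be sketched as below. The sample rate is not stated in this post, so the two rates used in the usage note (22050 Hz and 44100 Hz) are assumptions, and the function names are made up for illustration. The 312 lines and 50 frames per second are the same PAL values that appear in EmulateLine further down.

```c
/* Raster lines per second on a PAL C64: 312 lines per frame, 50 frames. */
#define LINES_PER_SECOND (312 * 50)

/* Average number of audio samples produced per raster line. */
static double samples_per_line(int sample_freq)
{
    return (double)sample_freq / LINES_PER_SECOND;
}

/* Number of video frames that pass while `samples` audio samples play. */
static double frames_per_buffer(int samples, int sample_freq)
{
    double seconds = (double)samples / sample_freq;
    return seconds * 50.0;
}
```

With these assumptions, frames_per_buffer(512, 22050) comes to about 1.16 frames and frames_per_buffer(512, 44100) to about 0.58 frames, so depending on the sample rate a 512 sample buffer covers somewhere between half a frame and a little over one frame.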

So, should it happen that you change a particular SID parameter a number of times during the period of a VIC-II frame, only the last value will be used by calc_buffer.

Integration

I am now going to discuss how to integrate the Frodo SID functionality within our emulator.

First, we need to add the following methods to FrontActivity:

    public void initAudio(int freq, int bits, int sound_packet_length) {
        if (audio == null) {
            audio = new AudioTrack(AudioManager.STREAM_MUSIC, freq, AudioFormat.CHANNEL_CONFIGURATION_MONO, 
            bits == 8?AudioFormat.ENCODING_PCM_8BIT: AudioFormat.ENCODING_PCM_16BIT, 
            freq==44100?32*1024:16*1024, AudioTrack.MODE_STREAM);
            audio.play();
        }
    }

    public void sendAudio(short data []) {
        if (audio != null) {
            long startTime = System.currentTimeMillis();
            audio.write(data, 0, data.length);
            long endTime = System.currentTimeMillis();
            System.out.println(endTime - startTime);

        }
    }

Frodo C64 contains similar Java methods, but in a different place. In our case we do it within our FrontActivity.

Within initAudio we create a global AudioTrack object called audio.

An AudioTrack object can accept samples via write(), which it will then play on the speaker of your Android device.

The sendAudio method receives a buffer of samples to output as sound. We will invoke sendAudio when we have a set of samples from calc_buffer.

Next up, we will be discussing the JNI glue required so that our native code can communicate with the Java layer in order to send sound samples to play.

Up to now we have invoked native methods from Java code quite a number of times.

We have, however, not actually done the reverse, that is, invoking Java methods from native code.

There is a bit of a headache involved when you want to invoke a Java method from native code: You need to have a JNIEnv instance.

A JNIEnv instance gets passed explicitly as the first function parameter when you invoke a native method from Java. One might therefore be tempted to think that one can just invoke a native method from Java during initialization and within this native method just store the JNIEnv instance as a global variable.

This approach has a limitation. A JNIEnv instance is only valid for a particular thread.

Our application has two threads. One is the main GUI thread and the other one is the GLRenderer thread, which we will also use to generate SID sound samples.

So, if we set a global JNIEnv instance when our FrontActivity initialises, our GLRenderer thread will not be able to use this instance.

All hope, however, is not lost. We can still get a valid JNIEnv instance with the help of a JavaVM instance. A JavaVM instance, by the way, can be used by all threads running within the virtual machine.

Firstly, we need to get an instance of JavaVM in native code as follows:

jint JNI_OnLoad(JavaVM* aVm, void* aReserved) {
  gJavaVM = aVm;

  return JNI_VERSION_1_6;
}

The virtual machine will invoke the JNI_OnLoad function within your native library when it loads the library.

This function can be in any of your C files, it doesn't matter. However, just ensure that you define it only once within your native code.

We can now get a JNIEnv variable that is applicable to our current thread that does the sound rendering.

The place I thought of placing this functionality was within EmulateLine of SID_android.h:

void EmulateLine()
{
...
  if (global_env == NULL) {
       (*gJavaVM)->AttachCurrentThread (gJavaVM, &global_env, NULL);
  }
...
}

This assumes you have defined global_env as a global variable within your native code.

We need to define a couple more JNI hooks. So, within memory.c define the following:

...
jobject currentActivity;
jmethodID initAudio;
jmethodID sendAudio;
...

void Java_com_johan_emulator_engine_Emu6502_setMainActivityObject(JNIEnv* env, jobject pObj, jobject activity) {

  currentActivity = (*env)->NewGlobalRef(env,activity);

  jclass thisClass = (*env)->GetObjectClass(env,currentActivity);

  initAudio = (*env)->GetMethodID(env, thisClass, "initAudio", "(III)V");
  sendAudio = (*env)->GetMethodID(env, thisClass, "sendAudio", "([S)V");


}


We will invoke this method from onCreate in our FrontActivity, passing the activity itself as a reference.

You will also see something interesting when we store this reference in our native method: we first create a global reference from it and then store that global reference.

It should be remembered that object references passed to native methods from Java are local references, which only have a lifetime while the native method executes. Once the native method exits, the local reference isn't valid any more.

Let us continue with the rest of the code within this native method. We get the class for our instance and from that we get the method IDs for initAudio and sendAudio, which we store.

Note that method IDs can be stored and used globally, without needing to worry about local/global references.

We now have all the important JNI hooks defined. Let us now see where they are used. All this happens within EmulateLine of SID_android.h. To get the overall picture, I am showing you the whole method:

void EmulateLine()
{
 static int divisor = 0;
 static int to_output = 0;
 static int buffer_pos = 0;
 static int loop_n = 2;
 static int loop_c = 0;
 if (!ready)
  return;

  if (global_env == NULL) {
       (*gJavaVM)->AttachCurrentThread (gJavaVM, &global_env, NULL);
  }

 sample_buf[sample_in_ptr] = volume;
 sample_in_ptr = (sample_in_ptr + 1) % SAMPLE_BUF_SIZE;

 // Now see how many samples have to be added for this line
 divisor += SAMPLE_FREQ;
 while (divisor >= 0)
  divisor -= 312*50, to_output++;

 // Calculate the sound data only when we have enough to fill
 // the buffer entirely
 if ((buffer_pos + to_output) >= sndbufsize) {

  int datalen = sndbufsize - buffer_pos;
  to_output -= datalen;
  calc_buffer(sound_buffer + buffer_pos, datalen*2);
  
  if (!audioarray)
  {
    jshortArray tempArrayRef =  (*global_env)->NewShortArray(global_env, sndbufsize*loop_n);
                  audioarray = (*global_env)->NewGlobalRef(global_env,tempArrayRef);
  }
  (*global_env)->SetShortArrayRegion(global_env, audioarray, loop_c*sndbufsize, sndbufsize, sound_buffer);
  loop_c++;
  if (loop_c == loop_n)
  {
   (*global_env)->CallVoidMethod(global_env, currentActivity, sendAudio, audioarray);
   loop_c = 0;
  }
  
  
  buffer_pos = 0;
  
 }
}


The bottom line of the JNI code in this method is to call the Java method sendAudio so that the produced sound samples can be sent to the Java sound system.

We have, however, a problem with the datatype of the native array we use to store the samples. We cannot just send it to the Java method sendAudio as is.

Arrays in Java are also Java objects, so we need to convert a native array to a Java object before passing it to sendAudio.

We do this by first creating a Java array audioarray of type jshortArray. We then call SetShortArrayRegion to copy the data from our native array to our Java array.

This concludes our Integration discussion.

A Test Run 

During a Test Run the sound played smoothly on my Android device.

There was, however, a side effect with the video output. Every now and again, a frame would freeze for a while.

Some closer investigation revealed that the issue was caused by one of the parameters passed when we create an AudioTrack instance.

The offending parameter was the second last one, specifying the size of the underlying buffer. The value specified was 32kB.

In effect, what was happening was that while the buffer still had space left, it quickly accepted each batch of 512 samples. This continued until the buffer eventually filled up completely.

When the buffer is full, the write call on AudioTrack blocks until a certain portion of the samples has played, making space for new samples.

This blocking caused the jerky graphics.

I resolved this issue by reducing the underlying buffer size to just double the size of the buffer returned by calc_buffer, so we are effectively double buffering.
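To get a feel for the difference between the two buffer sizes, the playing time of a buffer can be worked out as sketched below. This assumes 16-bit mono samples at 44100 Hz and a calc_buffer output of 512 samples (1024 bytes), so that double buffering means 2048 bytes; the function name is made up for illustration.

```c
/* Milliseconds of 16-bit mono audio that fit in a buffer of `bytes` bytes. */
static double buffer_ms(int bytes, int sample_freq)
{
    int samples = bytes / 2; /* two bytes per 16-bit sample */
    return 1000.0 * samples / sample_freq;
}
```

Under these assumptions buffer_ms(32 * 1024, 44100) is roughly 371 ms, which is more than 18 video frames of audio that the emulator can race ahead with before write() blocks. With buffer_ms(2048, 44100), the figure drops to roughly 23 ms, which is close to a single frame, so the blocking is spread out evenly instead of arriving in one long stall.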

In Summary

In this post we have implemented SID emulation within our emulator.
This concludes my series on writing a C64 emulator for Android.

Friday 30 December 2016

Part 17: Implementing Sprites

Foreword

In the previous post we implemented the remaining graphic modes required by the game Dan Dare.

One thing, though, we haven't implemented for the game Dan Dare is Sprites.

Sprites on the C64 have some complexity of their own. A sprite can contain transparent pixels and either be shown behind text or in front of text.

These complexities of sprites can cause a couple of headaches when trying to implement them within our emulator.

Luckily most Android devices ship with a GPU that can relieve some of these complexities for us. For example, GPUs support a functionality called Alpha Blending that makes it easy for us to implement transparency.

Android exposes the functionality of the GPU via OpenGL ES. Therefore in this post I will also be talking a bit about OpenGL ES within the context of Android.

Finally, we will be implementing Sprites within our emulator using OpenGL ES.

Overview of OpenGL ES

OpenGL ES can be viewed as a branch of OpenGL.

OpenGL is an open standard for accessing GPU hardware in a standard way.

OpenGL ES also stems from OpenGL, but is optimised for mobile/embedded devices which have limitations on available memory, CPU and in general try to optimise battery life.

Implementing Stacked Rendering

To implement Sprite rendering I will be using the same approach as in my JavaScript C64 emulator, here.

The approach I used in my JavaScript C64 emulator involved stacking a number of canvases on top of each other. I used the following canvases:

  • Background
  • Background Sprites
  • Foreground
  • Foreground Sprites 

Each canvas has a Z-order attribute telling the renderer in which order the canvases should be stacked.

We can use the same approach in our Android C64 emulator with the help of OpenGL ES.

Each layer we draw as two textured triangles that together form a rectangle. To specify the stacking order we will be using different z-coordinate values for the different planes.

Apart from specifying the correct z-order for the stacking, it is also important to draw the planes in the correct order, that is, start with the Background layer and work your way through till you get to the Foreground Sprites layer. If you don't get your order right, alpha blending will not produce the desired effect, that is your foreground will appear as the background and vice versa.
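To see why the drawing order matters, here is a minimal sketch of what alpha blending does to a single colour channel when the blend function is GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA, which is the one set up in onDrawFrame later in this post. The function name is made up for illustration and the values are normalised to the range 0 to 1.

```c
/* One colour channel of "source over destination" blending:
   result = src * alpha + dst * (1 - alpha).
   src is the pixel being drawn now, dst is what is already on screen. */
static float blend_channel(float src, float src_alpha, float dst)
{
    return src * src_alpha + dst * (1.0f - src_alpha);
}
```

A fully opaque pixel (alpha 1) replaces whatever was drawn before it, while a fully transparent pixel (alpha 0) leaves it untouched. Whatever is drawn last therefore ends up on top wherever it is opaque, which is why the Background layer has to be drawn first and the Foreground Sprites layer last.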

We will now go into implementation details for stacking.

Defining a new Surface

Up to now, we have used a SurfaceView to display the output of our emulator on the screen. However, since we want to go the OpenGL ES route, this will need to change.

We need to define a GLSurfaceView that OpenGL can use to draw to the screen. So, within content_front.xml we need to make the following change:

...
    <android.opengl.GLSurfaceView
        android:id="@+id/Video"
        android:layout_width="match_parent"
        android:layout_height="100px" />
...

Next, we need to add a renderer to the GLSurfaceView. So, within FrontActivity.java we need to make the following changes:

...
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_front);

        GLSurfaceView mGLSurfaceView = (GLSurfaceView) findViewById(R.id.Video);

        mGLSurfaceView.setEGLContextClientVersion(2);
        MyGL20Renderer myRenderer = new MyGL20Renderer(this);
        mGLSurfaceView.setRenderer(myRenderer);
...
     }
...

We make a call to setEGLContextClientVersion to set the OpenGL version to 2.

MyGL20Renderer is a class we need to define that will perform the necessary drawing. The following important methods are defined within this class:

    public void onDrawFrame(GL10 unused)
    {
        GLES20.glDisable(GLES20.GL_DEPTH_TEST);

        GLES20.glEnable(GLES20.GL_BLEND);
        GLES20.glBlendFunc(GLES20.GL_SRC_ALPHA, GLES20.GL_ONE_MINUS_SRC_ALPHA);

        GLES20.glDepthMask(false);
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);

        //Set the camera position (View Matrix)
        Matrix.setLookAtM(mVMatrix, 0, 0, 0, 3, 0f, 0f, 0f, 0f, 1.0f, 0.0f);

        //Calculate the projection and view transformation
        Matrix.multiplyMM(mMVPMatrix, 0, mProjMatrix, 0, mVMatrix, 0);

        //Create a rotation transformation for the triangle
        Matrix.setRotateM(mRotationMatrix, 0, mAngle, 0, 0, -1.0f);

        //Combine the rotation matrix with the projection and camera view
        Matrix.multiplyMM(mMVPMatrix, 0, mRotationMatrix, 0, mMVPMatrix, 0);
        GLES20.glEnable(GLES20.GL_BLEND);

        emuInstance.clearDisplayBuffer();
        emuInstance.runBatch(0);
        sprite.Draw(mMVPMatrix, byteBuffer);
    }

    public void onSurfaceChanged(GL10 unused, int width, int height)
    {
        GLES20.glViewport(0, 0, width, height);

        float ratio = (float) width / height;

        Matrix.orthoM(mProjMatrix,0, -ratio, ratio, -1, 1, 3, 7);
    }


The onSurfaceChanged method gets invoked when the surface gets created and when the dimensions of the screen change, like when turning the device from a portrait orientation to a landscape orientation.

In onSurfaceChanged we apply an Orthographic Projection Matrix. An Orthographic Projection Matrix is in contrast with a Perspective Projection Matrix. The latter makes distant objects look smaller, whereas with the former everything looks the same size irrespective of distance.

Strictly speaking, we don't need a matrix just to get an orthographic projection. We do, however, need to apply a matrix to get the aspect ratio right. Without it the graphics will look squashed.

There are a lot of things happening within the method onDrawFrame. Before we start to discuss what is going on within this method, it may be worthwhile to mention that this method gets invoked by the OpenGL subsystem. So we don't really have a concrete way to control the framerate of our application, except maybe if you add sleeps to slow down the app to the required rate. I will, however, not cover speed control in this post.

Back to the detail of onDrawFrame. In the first couple of lines we enable alpha blending. You will also see that we disable depth testing, since it doesn't make a difference in our world.

We also invoke setLookAtM to position the camera three units in front of our layers.

You will also see that we now invoke runBatch from this method.

The actual drawing happens within sprite.Draw(). Sprite is also one of the new classes we have created. There are a lot of things happening in this class, so I will try to give a summary of what happens within it.

Texture Rendering

As explained previously, we will draw the screen as 4 different layers with Alpha blending.

Each layer will be drawn as a rectangle filled with a texture. Theoretically this means that we need to supply four textures. It should be noted, though, that a bit of overhead is associated with each texture transfer.

To reduce the overhead a bit, we can combine the four textures into one large texture. As an example, look at the following:



The first picture is the resulting picture. The next picture shows the combined textured image. I have used the color magenta to indicate the transparent pixels.

When drawing the different layers, you just need to ensure that the texture coordinates take the portion of the combined texture that is applicable to the layer. To get a feel for the texture coordinates, have a look at the following definition:

        final float[] cubeTextureCoordinateData =
                {
                        //Sprite Foreground
                        0.0f, 0.0f, // top left
                        0.0f, 1.0f, // bottom left
                        0.258426966f, 1.0f, // bottom right
                        0.258426966f, 0.0f, //top right

                        //Foreground
                        0.258426966f, 0.0f, // top left
                        0.258426966f, 1.0f, // bottom left
                        0.516853932f, 1.0f, // bottom right
                        0.516853932f, 0.0f, //top right

                        //Sprite Background
                        0.516853932f, 0.0f, // top left
                        0.516853932f, 1.0f, // bottom left
                        0.775280898f, 1.0f, // bottom right
                        0.775280898f, 0.0f, //top right


                        //Background
                        0.775280898f, 0.166666f, // top left
                        0.775280898f, 0.833333f, // bottom left
                        1.0f, 0.833333f, // bottom right
                        1.0f, 0.166666f //top right
                };

Texture coordinates are in the range 0-1 from left to right, as well as from top to bottom.
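The odd looking values in the table are simply pixel positions divided by the size of the combined texture. Here is a small sketch of that calculation, assuming a combined texture of 1424 by 300 pixels made up of three 368 pixel wide layers followed by a 320 pixel wide background. The macro and function names are made up for illustration.

```c
#define ATLAS_LAYER_W 368                        /* width of the three full layers */
#define ATLAS_BG_W    320                        /* width of the background layer  */
#define ATLAS_W       (3 * ATLAS_LAYER_W + ATLAS_BG_W) /* 1424 pixels in total     */
#define ATLAS_H       300

/* Horizontal texture coordinate (0..1) of a pixel column in the atlas. */
static float u_coord(int pixel_x)
{
    return (float)pixel_x / ATLAS_W;
}

/* Vertical texture coordinate (0..1) of a pixel row in the atlas. */
static float v_coord(int pixel_y)
{
    return (float)pixel_y / ATLAS_H;
}
```

So u_coord(368), u_coord(736) and u_coord(1104) give the three layer boundaries 0.2584, 0.5169 and 0.7753. The vertical values 0.1667 and 0.8333 used for the background are v_coord(50) and v_coord(250), which looks like a 200 line window inside the 300 line buffer.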

Going Native

Time for us to modify our native code for drawing sprites.

Up to now our native code wrote the video output to a native buffer with size 368 by 300 pixels.

However, as seen in the previous section, we will need to have a larger buffer containing a combination of all four layers. Therefore, the dimensions of our native buffer will need to change to 1424x300. This dimension change requires the following change within FrontActivity.java:

...
    @Override
    protected void onCreate(Bundle savedInstanceState) {
...
       mByteBuffer = ByteBuffer.allocateDirect(
                (368 + //foreground sprite
                 368 + //front
                 368 + //background sprite
                 320 //background

                 ) * (300) * 4);
...
    }
...

Now, to the native code.

We start off by adding the following definitions to video.c:

...
#define STRIDE (368 + 368 + 368 + 320)

jchar colors_RGB_888[16][3] = {
{0, 0, 0},
                  {255, 255, 255},
                  {136, 0, 0},
                  {170, 255, 238},
                  {204, 68, 204},
                  {0, 204, 85},
                  {0, 0, 170},
                  {238, 238, 119},
                  {221, 136, 85},
                  {102, 68, 0},
                  {255, 119, 119},
                  {51, 51, 51},
                  {119, 119, 119},
                  {170, 255, 102},
                  {0, 136, 255},
                  {187, 187, 187}
};

jchar colors_RGB_565[16];

jint colors_RGB_8888[16];

void initialise_video() {
  int i;
...
  for (i=0; i < 16; i++) {
    colors_RGB_8888[i] = (255 << 24) | (colors_RGB_888[i][2] << 16) | (colors_RGB_888[i][1] << 8) | (colors_RGB_888[i][0] << 0);
    //colors_RGB_8888[i] = (255);
  }
}
...

We begin with a definition of STRIDE. This is the full length of a line in pixels. We use this constant to advance line by line. Of course, at any point in time we will only be working with a subsection of a line, depending on which layer we are busy with.

We also define a new color table, colors_RGB_8888. This is basically our color table with an alpha channel added for each color. When we populate this array within initialise_video, we set the alpha channel for each color to fully opaque so that we don't need to worry about this later on.
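The packing done in initialise_video can be sketched in isolation as follows. The function name is made up for illustration. Note that red ends up in the lowest byte, so on a little-endian device the bytes in memory are in the order red, green, blue, alpha. I am assuming that this is the byte order in which the texture gets uploaded, since the upload code is not shown in this post.

```c
#include <stdint.h>

/* Pack an opaque colour so that, on a little-endian CPU, the bytes in
   memory read R, G, B, A. Alpha is fixed at 255 (fully opaque). */
static uint32_t pack_rgba8888(uint8_t r, uint8_t g, uint8_t b)
{
    return (0xffu << 24) | ((uint32_t)b << 16) | ((uint32_t)g << 8) | (uint32_t)r;
}
```

For example, the C64 red {136, 0, 0} becomes 0xff000088. A pixel value of 0 has an alpha of zero and is therefore fully transparent, which is presumably what a cleared buffer contains before a layer is drawn into it.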

Next, we write the following code to keep track of the different positions within the buffer:

...
int startOfLineTxtBuffer = 0;
int startOfFrontSpriteBuffer = 0;
int startOfBackgroundSpriteBuffer = 0;
int posInFrontBuffer = 0;
int posInBackgroundBuffer = 0;
...
static inline void processLine() {
  if (line_count > 299)
    return;

  posInFrontBuffer = startOfLineTxtBuffer + 368;
  posInBackgroundBuffer = startOfLineTxtBuffer + 368 + 368 + 368;

  startOfFrontSpriteBuffer = startOfLineTxtBuffer;
  startOfBackgroundSpriteBuffer = startOfLineTxtBuffer + 368 + 368;
...
  startOfLineTxtBuffer = startOfLineTxtBuffer + STRIDE;
}

Just for clarity, Txt stands for texture.

The code for the graphic modes will not change a lot except for variable name changes. The graphic modes will now need to use and update the variable names posInFrontBuffer and posInBackgroundBuffer.
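The position bookkeeping above boils down to one formula for finding a pixel within the combined buffer. Here it is as a sketch, with made-up names, assuming the layer order used in processLine: front sprites, front, background sprites and then background.

```c
enum {
    TXT_LAYER_W = 368,                   /* width of each full-size layer */
    TXT_BG_W    = 320,                   /* width of the background layer */
    TXT_STRIDE  = 3 * TXT_LAYER_W + TXT_BG_W /* 1424 pixels per line      */
};

/* Layers in the order they are laid out, left to right, within one line. */
typedef enum {
    TXT_FRONT_SPRITES = 0,
    TXT_FRONT         = 1,
    TXT_BACK_SPRITES  = 2,
    TXT_BACKGROUND    = 3
} txt_layer;

/* Index into the combined pixel buffer of pixel x on the given line
   of the given layer. */
static int pixel_index(txt_layer layer, int line, int x)
{
    return line * TXT_STRIDE + (int)layer * TXT_LAYER_W + x;
}
```

For the first line this gives the same start positions as processLine: 0 for the front sprites, 368 for the front layer, 736 for the background sprites and 1104 for the background.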

Let us now move on to the Sprite code.

Part of the process of drawing a sprite involves a lot of back and forth between IO memory locations. I have therefore decided to create a method within memory.c that will do all the back and forth between IO locations and return the necessary info as a data structure. The definition of this data structure is as follows:

struct sprite_data_struct {
    int sprite_data;
    int sprite_type; //bit 1: xExpanded bit 0: multicolor
    int isForegroundSprite;
    int color_tablet[4];
    int sprite_x_pos;
    int number_pixels_to_draw;
};

Here is a description of the different fields:

  • sprite_data: the three bytes of data for the required line within the sprite
  • sprite_type: two bits indicating whether the sprite is xExpanded and multi colored
  • isForegroundSprite: whether the sprite is a foreground or background sprite
  • color_tablet: colors for the different bit combinations. If it is not a multi colored sprite only index one will be populated
  • sprite_x_pos: x position of the sprite
  • number_pixels_to_draw: either 24 or 48, depending on whether the sprite is X-Expanded, or fewer if the sprite is clipped at the right edge

And now for the implementation of this method within memory.c:

int processSprite(int spriteNum, int lineNumber, struct sprite_data_struct * sprite_data) {
  if (!(IOUnclaimed[0x15] & (1 << spriteNum)))
    return 0;

  int spriteY = IOUnclaimed[(spriteNum << 1) | 1];
  int yExpanded = IOUnclaimed[0x17] & (1 << spriteNum);
  int ySpriteDimension = yExpanded ? 42 : 21;
  int spriteYMax = spriteY + ySpriteDimension;

  if (!((lineNumber >= spriteY) && (lineNumber < spriteYMax)))
    return 0;

  int spriteX = (IOUnclaimed[spriteNum << 1] & 0xff);
  if (IOUnclaimed[0x10] & (1 << spriteNum))
    spriteX = 256 | spriteX;

  if (spriteX > 367)
    return 0;

  int xExpanded = IOUnclaimed[0x1d] & (1 << spriteNum);
  int xSpriteDimension = xExpanded ? 48 : 24;
  int spriteXMax = spriteX + xSpriteDimension;

  if (spriteXMax > 367)
    xSpriteDimension = 368 - spriteX;

  sprite_data->sprite_x_pos = spriteX;
  sprite_data->number_pixels_to_draw = xSpriteDimension;
  sprite_data->sprite_type = 0;
  if (xExpanded)
    sprite_data->sprite_type = sprite_data->sprite_type | 2;

  if (IOUnclaimed[0x1c] & (1 << spriteNum)) {
    sprite_data->sprite_type = sprite_data->sprite_type | 1;
    sprite_data->color_tablet[1] = IOUnclaimed[0x25] & 0xf;
    sprite_data->color_tablet[2] = IOUnclaimed[0x27 + spriteNum] & 0xf;
    sprite_data->color_tablet[3] = IOUnclaimed[0x26] & 0xf;
  } else {
    sprite_data->color_tablet[1] = IOUnclaimed[0x27 + spriteNum] & 0xf;
  }

  // A set bit in 0x1b puts the sprite behind the foreground graphics.
  // Always assign the field so that it is never left uninitialised.
  sprite_data->isForegroundSprite = (IOUnclaimed[0x1b] & (1 << spriteNum)) ? 0 : 1;

  int memPointer = IOUnclaimed[0x18];
  int spritePointerAddress = memPointer & 0xf0;
  spritePointerAddress = spritePointerAddress << 6;

  spritePointerAddress = ((~IOUnclaimed[0xd00] & 3) << 14) | spritePointerAddress;
  spritePointerAddress = spritePointerAddress + 0x400 -8 + spriteNum;
  int spriteBaseAddress = mainMem[spritePointerAddress] << 6;


  int spriteLineNumber = lineNumber - spriteY;

  if (yExpanded)
    spriteLineNumber = spriteLineNumber >> 1;

  int posInSpriteData = (spriteLineNumber << 1) + (spriteLineNumber) + spriteBaseAddress;
  sprite_data->sprite_data = (mainMem[posInSpriteData + 0] << 16) | (mainMem[posInSpriteData + 1] << 8)
          | (mainMem[posInSpriteData + 2] << 0);

  return 1;

}

As parameters we accept the sprite number in question, the current line number and a pointer to a sprite_data_struct in which we should return the required data.

The int return value acts as a boolean indicating whether the current sprite should be drawn. If any reason is found why the sprite shouldn't be drawn, we immediately return 0.
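The address arithmetic near the end of processSprite is probably the least obvious part, so here it is as a standalone sketch. The function names are made up for illustration and the register values are passed in as plain parameters instead of being read from IOUnclaimed.

```c
#include <stdint.h>

/* Address of the byte that holds the data pointer for a sprite.
   d018: value of VIC-II register $D018, dd00: value of CIA2 port A ($DD00).
   The eight sprite pointers sit in the last 8 bytes of the 1 KB screen
   memory block, that is at screen base + 0x3f8. */
static int sprite_pointer_address(uint8_t d018, uint8_t dd00, int sprite_num)
{
    int bank_base   = ((~dd00) & 3) << 14; /* VIC bank, bits are inverted */
    int screen_base = (d018 & 0xf0) << 6;  /* screen memory within bank   */
    return bank_base + screen_base + 0x3f8 + sprite_num;
}

/* Offset of a given sprite line within the bank: each sprite occupies
   a 64 byte block and each line takes up 3 bytes. */
static int sprite_line_offset(int pointer_value, int sprite_line)
{
    return (pointer_value << 6) + sprite_line * 3;
}
```

With the power-on values, where the low bits of $DD00 are 3 and $D018 is 0x14, sprite_pointer_address(0x14, 0x03, 0) gives 0x07f8, the familiar location of the pointer for sprite 0. A pointer value of 13 then places the first sprite line at offset 832 and the last line (line 20) at offset 892.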

Let us now define the main process flow of processing sprites within video.c:

void processSprites() {
  int i;
  struct sprite_data_struct currentSpriteData;
  for (i = 0; i < 8; i++) {
    int currentSpriteNum = 7 - i;
    if (processSprite(currentSpriteNum, line_count, &currentSpriteData)) {
      spriteFunctions[currentSpriteData.sprite_type] (currentSpriteData);
    }
  }
}

static inline void processLine() {
  if (line_count > 299)
    return;

  posInFrontBuffer = startOfLineTxtBuffer + 368;
  posInBackgroundBuffer = startOfLineTxtBuffer + 368 + 368 + 368;

  startOfFrontSpriteBuffer = startOfLineTxtBuffer;
  startOfBackgroundSpriteBuffer = startOfLineTxtBuffer + 368 + 368;

  updatelineCharPos();
  fillColor(24, memory_unclaimed_io_read(0xd020) & 0xf);
  int screenEnabled = (memory_unclaimed_io_read(0xd011) & 0x10) ? 1 : 0;
  if (screenLineRegion && screenEnabled) {
    processSprites();
...
  }
...
}

For each line that we process we call processSprites, in which we loop through all the sprites, calling processSprite for each in turn. We previously defined processSprite within memory.c.

We only bother with a sprite if processSprite returned true.

You will see something interesting in the if statement where we test the return value of processSprite. When we enter this if-statement, one of four possible functions is called, depending on whether the sprite is X-Expanded or multi-coloured.

In the usual case we would use a four case switch statement. However, to save some CPU cycles, we can make use of a four element function pointer array for lookup. We declare and populate this array as follows:

...
void (*spriteFunctions[4]) (struct sprite_data_struct spriteData);

void drawExpandedMulticolorSpriteLine(struct sprite_data_struct spriteData);
void drawUnExpandedMulticolorSpriteLine(struct sprite_data_struct spriteData);
void drawExpandedNormalSpriteLine(struct sprite_data_struct spriteData);
void drawUnExpandedNormalSpriteLine(struct sprite_data_struct spriteData);
...
void initialise_video() {
...
  spriteFunctions[0] = &drawUnExpandedNormalSpriteLine;
  spriteFunctions[1] = &drawUnExpandedMulticolorSpriteLine;
  spriteFunctions[2] = &drawExpandedNormalSpriteLine;
  spriteFunctions[3] = &drawExpandedMulticolorSpriteLine;
}
...
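
To make the lookup concrete, here is a minimal sketch of how the index into such a table could be formed. It assumes that sprite_type is built with the X-expand bit as bit 1 and the multicolor bit as bit 0, which matches the order in which the array is populated above. The helper name sprite_type_index is hypothetical and not part of the emulator source.

```c
/* Hypothetical helper: derive the 0-3 index into spriteFunctions.
 * xExpandReg is the value of location D01D and multicolorReg the value
 * of location D01C; each register holds one bit per sprite. */
int sprite_type_index(int xExpandReg, int multicolorReg, int spriteNum) {
  int expanded = (xExpandReg >> spriteNum) & 1;
  int multicolor = (multicolorReg >> spriteNum) & 1;
  return (expanded << 1) | multicolor;
}
```

With an index like this, a single indexed call takes the place of a four case switch statement.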


The sprite functions above are defined as follows:

void drawUnExpandedNormalSpriteLine(struct sprite_data_struct currentSpriteData) {
      int currentPosInSpriteBuffer;
      if (currentSpriteData.isForegroundSprite)
        currentPosInSpriteBuffer = startOfFrontSpriteBuffer;
      else
        currentPosInSpriteBuffer = startOfBackgroundSpriteBuffer;
      currentPosInSpriteBuffer = currentPosInSpriteBuffer + currentSpriteData.sprite_x_pos;
      int j;
      int spriteData = currentSpriteData.sprite_data;
      int upperLimit = currentPosInSpriteBuffer + currentSpriteData.number_pixels_to_draw;
      for (j = currentPosInSpriteBuffer; j < (upperLimit); j++) {
        if (spriteData & 0x800000) {
          g_buffer[currentPosInSpriteBuffer] = colors_RGB_8888[currentSpriteData.color_tablet[1]];
        }
        spriteData = (spriteData << 1) & 0xffffff;
        currentPosInSpriteBuffer++;
      }

}

void drawExpandedNormalSpriteLine(struct sprite_data_struct currentSpriteData) {
      int currentPosInSpriteBuffer;
      if (currentSpriteData.isForegroundSprite)
        currentPosInSpriteBuffer = startOfFrontSpriteBuffer;
      else
        currentPosInSpriteBuffer = startOfBackgroundSpriteBuffer;
      currentPosInSpriteBuffer = currentPosInSpriteBuffer + currentSpriteData.sprite_x_pos;
      int j;
      int spriteData = currentSpriteData.sprite_data;
      for (j = 0; j < (currentSpriteData.number_pixels_to_draw >> 1); j++) {
        if (spriteData & 0x800000) {
          g_buffer[currentPosInSpriteBuffer + 0] = colors_RGB_8888[currentSpriteData.color_tablet[1]];
          g_buffer[currentPosInSpriteBuffer + 1] = colors_RGB_8888[currentSpriteData.color_tablet[1]];
        }
        currentPosInSpriteBuffer = currentPosInSpriteBuffer + 2;
        spriteData = (spriteData << 1) & 0xffffff;
      }

}

void drawUnExpandedMulticolorSpriteLine(struct sprite_data_struct currentSpriteData) {
      int currentPosInSpriteBuffer;
      if (currentSpriteData.isForegroundSprite)
        currentPosInSpriteBuffer = startOfFrontSpriteBuffer;
      else
        currentPosInSpriteBuffer = startOfBackgroundSpriteBuffer;
      currentPosInSpriteBuffer = currentPosInSpriteBuffer + currentSpriteData.sprite_x_pos;
      int j;
      int spriteData = currentSpriteData.sprite_data;
      for (j = 0; j < (currentSpriteData.number_pixels_to_draw >> 1); j++) {
        int pixels = (spriteData & 0xC00000) >> 22;
        if (pixels > 0) {
          g_buffer[currentPosInSpriteBuffer + 0] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
          g_buffer[currentPosInSpriteBuffer + 1] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
        }
        currentPosInSpriteBuffer = currentPosInSpriteBuffer + 2;
        spriteData = (spriteData << 2) & 0xffffff;
      }

}

void drawExpandedMulticolorSpriteLine(struct sprite_data_struct currentSpriteData) {
      int currentPosInSpriteBuffer;
      if (currentSpriteData.isForegroundSprite)
        currentPosInSpriteBuffer = startOfFrontSpriteBuffer;
      else
        currentPosInSpriteBuffer = startOfBackgroundSpriteBuffer;
      currentPosInSpriteBuffer = currentPosInSpriteBuffer + currentSpriteData.sprite_x_pos;
      int j;
      int spriteData = currentSpriteData.sprite_data;
      for (j = 0; j < (currentSpriteData.number_pixels_to_draw >> 2); j++) {
        int pixels = (spriteData & 0xC00000) >> 22;
        if (pixels > 0) {
          g_buffer[currentPosInSpriteBuffer + 0] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
          g_buffer[currentPosInSpriteBuffer + 1] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
          g_buffer[currentPosInSpriteBuffer + 2] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
          g_buffer[currentPosInSpriteBuffer + 3] = colors_RGB_8888[currentSpriteData.color_tablet[pixels]];
        }
        currentPosInSpriteBuffer = currentPosInSpriteBuffer + 4;
        spriteData = (spriteData << 2) & 0xffffff;
      }

}

This more or less concludes the development required for displaying sprites in our emulator.

A Test Run

With all the changes I did a test run.

All the sprite graphics came out pretty much as expected.

As mentioned before, I haven't made any attempt to add delays to slow down to real speed. Despite all the extra functionality added I still experienced a faster than 1.0X speed.

The faster than 1.0X speed gives us some confidence that we have some CPU room left to do other stuff, like implementing SID sound.

Let us end this section off with some screenshots.





In Summary

In this post we have implemented sprites with the help of OpenGL ES to assist with transparency.

Well, I think we have almost reached the point of concluding this series on creating an Android C64 emulator. In fact, when I did my JavaScript emulator series, the sprites post was my final post.

However, after some contemplation over the last couple of days, I decided to add a couple more posts on adding SID sound.

So, in the next post I am planning to start exploring the SID with the view of implementing sound within our Android C64 emulator.

Till next time!

Thursday 15 December 2016

Part 16: Implementing the other Graphic Modes

Foreword

In the previous post we jacked up the video rendering of our emulator quite a bit. Here is a summarised list of the rendering enhancements:

  • Adding color and surrounding border
  • Implemented scanline rendering

With the above-mentioned enhancements we could manage to simulate the flashing borders with stripes when loading the game Dan Dare from the tape image.

There were, however, still a couple of things that didn't render correctly while the game was loading:

  • No splash screen
  • Intro screen looked a bit garbled

In this post we will be trying to solve the above-mentioned issues. This journey will involve implementing more of the graphics modes of the VIC-II. We will also need to properly implement the memory model of the VIC-II within our emulator.

To aid us in testing, it will also make sense to add joystick emulation to our emulator. Without joystick emulation we will not be able to get past the intro screen to evaluate some of the other graphic modes.


Joystick Emulation

Let us start with implementing joystick emulation.

The basic idea is to draw the joystick on the screen of the Android device and then check when the user has selected a joystick direction or pressed the fire button.

The on screen joystick will look as follows:


In effect, there is one control for selecting direction and another one for fire.

You will notice that I have catered for eight directions on the directional control instead of four. This just makes handling on the touch screen easier when you need to go in a diagonal direction while simultaneously pressing the fire button.

Our directional control can be described as a track divided into sectors. These sectors are spaced from each other by a couple of degrees.

So, how do we determine which direction is pressed down? To answer this question we need to view the directional control as a circle and ask: Where are we on the circle?

An Android touch screen provides the current position of the user's finger on the screen as an (x,y) coordinate. First of all, we need to transform this coordinate to a coordinate relative to the centre of the directional control circle.

From our school days, such a transformed coordinate can also be thought of as a rectangular coordinate.

In order for us to determine which direction was selected we need to convert this rectangular coordinate to polar form, that is getting an angle in degrees and radius. We can then determine which direction was selected with the following table:


  • 112.5 Degrees to 67.5 Degrees = North
  • 67.5 Degrees to 22.5 Degrees = North East
  • 22.5 Degrees to (-22.5 Degrees) = East
  • (-22.5 Degrees) to (-67.5 Degrees) = South East
  • (-67.5 Degrees) to (-112.5 Degrees) = South
  • (-112.5 Degrees) to (-157.5 Degrees) = South West
  • (-157.5 Degrees) to (-202.5 Degrees) = West
  • (-202.5 Degrees) to (-247.5 Degrees) = North West

The negative angles may look a bit weird. Just remember that we start at North and move clockwise. Once we go past 0 Degrees we keep counting downwards into negative angles, rather than wrapping around to 360 Degrees.

All this logic will happen in the JoystickView class. To keep focused I will not discuss the technical details of this class. Just remember that this class outputs the selected direction as an integer between 0 and 7. 0 is North and the numbers increase in a clockwise direction until, with number 7, you are at North West, following the order of the table above.
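
For readers who would like to see the sector lookup in code, here is a minimal sketch. It is not the JoystickView code (that class is written in Java and is not shown in this post). Instead of converting to an explicit angle, the sketch relies on the fact that a point lies within 22.5 degrees of an axis when its smaller component is at most tan(22.5 degrees) times the larger one, which gives the same sectors as the table. The function name, the parameters and the dead zone are assumptions made for illustration.

```c
#include <stdlib.h>

/* tan(22.5 degrees): half the width of one 45 degree sector */
#define TAN_22_5 0.41421356

/* Hypothetical helper, not the actual JoystickView code.
 * dx: finger x minus centre x (positive = right)
 * dy: centre y minus finger y (positive = up, since screen y grows downwards)
 * deadZone: radius in pixels within which no direction is selected
 * Returns 0 (North) to 7 (North West) clockwise, or -1 inside the dead zone. */
int joystick_direction(int dx, int dy, int deadZone) {
  int ax = abs(dx);
  int ay = abs(dy);
  if (dx * dx + dy * dy <= deadZone * deadZone)
    return -1;
  if (ay <= ax * TAN_22_5)          /* within 22.5 degrees of the x axis */
    return (dx > 0) ? 2 : 6;        /* East or West */
  if (ax <= ay * TAN_22_5)          /* within 22.5 degrees of the y axis */
    return (dy > 0) ? 0 : 4;        /* North or South */
  if (dy > 0)
    return (dx > 0) ? 1 : 7;        /* North East or North West */
  return (dx > 0) ? 3 : 5;          /* South East or South West */
}
```

The returned number can be passed on as the direction, with -1 standing for "no direction selected".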

Let us do some coding. Within memory.c add the following methods:

void Java_com_johan_emulator_engine_Emu6502_setFireButton(JNIEnv* pEnv, jobject pObj, jint fireButtonStatus) {
  if (fireButtonStatus)
    joystickStatus = joystickStatus | 16;
  else
    joystickStatus = joystickStatus & 0xef;
}

void Java_com_johan_emulator_engine_Emu6502_setJoystickDirectionButton(JNIEnv* pEnv, jobject pObj, jint direction) {
  //start at north, then goes clockwise till again at north
  //clear all four direction bits (0-3), assuming the JOYSTICK_* constants follow the DC00 bit layout
  joystickStatus = joystickStatus & 0xf0;
  switch (direction) {
    case 0: //North
      joystickStatus = joystickStatus | JOYSTICK_UP;
    break;

    case 1: //North East
      joystickStatus = joystickStatus | JOYSTICK_UP | JOYSTICK_RIGHT;
    break;

    case 2: //East
      joystickStatus = joystickStatus | JOYSTICK_RIGHT;
    break;

    case 3: //South East
      joystickStatus = joystickStatus | JOYSTICK_DOWN | JOYSTICK_RIGHT;
    break;

    case 4: //South
      joystickStatus = joystickStatus | JOYSTICK_DOWN;
    break;

    case 5: //South West
      joystickStatus = joystickStatus | JOYSTICK_DOWN | JOYSTICK_LEFT;
    break;

    case 6: //West
      joystickStatus = joystickStatus | JOYSTICK_LEFT;
    break;

    case 7: //North West
      joystickStatus = joystickStatus | JOYSTICK_LEFT | JOYSTICK_UP;
    break;

    default:
    break;

  }
}


When the user presses or releases either the fire button or one of the directional buttons, one of these methods will be invoked.

The joystickStatus has the same format as the joystick byte at memory location DC00. Finally, we need to make the following change to cia1_read:

jchar cia1_read(int address) {
  jchar result = 0;
  switch (address) {
    case 0xdc00:
      result =  ~joystickStatus & 0xff ;
    break;

    case 0xdc01:
      result = getKeyPortByte(mainMem[0xdc00]);
    break;
...
  }
...
} 
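
Note the inversion in the 0xdc00 case. The joystick lines on the C64 are active low: a released switch reads as 1 and a pressed switch as 0. Our joystickStatus works the other way around, so we flip the bits when the port is read. The little sketch below isolates that step. The name joystick_port_value is hypothetical, and the example values assume that up is bit 0 and fire is bit 4.

```c
/* Hypothetical helper mirroring the 0xdc00 case of cia1_read:
 * joystickStatus holds a 1 for every pressed switch, whereas the
 * port returns a 0 for every pressed switch. */
int joystick_port_value(int joystickStatus) {
  return ~joystickStatus & 0xff;
}
```

With nothing pressed the port reads 0xff, and with up and fire pressed together it reads 0xee.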

The VIC II Memory Model

Up to now we have hardcoded some assumptions into our emulator regarding VIC II memory access. The first assumption is that screen character memory starts at memory location 1024, and the other is that the character bitmap images are always retrieved from character ROM.

These assumptions will not be sufficient for us to continue, so we will need to properly implement the VIC memory model.

We will use the following snippet of code to calculate the correct Video memory and character memory:

  int memPointer = memory_unclaimed_io_read(0xd018);
  int videoMemoryBase = memPointer & 0xf0;
  videoMemoryBase = videoMemoryBase << 6;
  int charROMBase = memPointer & 0xe;
  charROMBase = charROMBase << 10;

This snippet of code, however, will yield only 14-bit wide addresses. The other 2 bits are provided by location DD00 in the IO bank. The following methods within memory.c make use of location DD00:

void memory_read_batch(int *batch, int address, int count) {
  address = ((~IOUnclaimed[0xd00] & 3) << 14) | address;
  int i;
  for (i = 0; i < count; i++) {
    //test and index each location individually, so that a batch
    //falling in character ROM does not return the same byte every time
    int current = address + i;
    if ((current >= 0x1000 && current < 0x2000) || (current >= 0x9000 && current < 0xa000))
      batch[i] = charRom[current & 0xfff];
    else
      batch[i] = mainMem[current];
  }
}

jchar memory_read_vic_model(int address) {
  address = ((~IOUnclaimed[0xd00] & 3) << 14) | address;
  if ((address >= 0x1000 && address < 0x2000) || (address >= 0x9000 && address < 0xa000))
    return charRom[address & 0xfff];
  else
    return mainMem[address];

}


It should be noted that in the VIC-II memory model, the memory ranges 1000-2000 and 9000-A000 will always point to a location in Character ROM.
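
As a worked example, the sketch below combines the D018 snippet with the bank bits from DD00 to give full 16-bit addresses. The helper names are hypothetical, and the register values in the example (0x15 for D018 and the two low bits of DD00 both set) are what I assume to be the usual state after power-on.

```c
/* Hypothetical helpers; d018 and dd00 are the raw register values. */

/* Bank base selected by the two low bits of DD00, which are inverted. */
int vic_bank_base(int dd00) {
  return (~dd00 & 3) << 14;
}

/* Full 16-bit address of video (screen character) memory. */
int vic_video_address(int d018, int dd00) {
  return vic_bank_base(dd00) | ((d018 & 0xf0) << 6);
}

/* Full 16-bit address of the character bitmap data. */
int vic_char_address(int d018, int dd00) {
  return vic_bank_base(dd00) | ((d018 & 0xe) << 10);
}
```

With these values, video memory ends up at 0x0400 (decimal 1024, the location we hardcoded before) and the character data at 0x1000, which falls in one of the two ranges that map to Character ROM.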

Implementing the Other Graphic Modes

Currently our processLine method within video.c only works with one graphics mode. To cater for the other modes, we need to add a switch statement as follows:

static inline void processLine() {
  if (line_count > 299)
    return;

  updatelineCharPos();
  fillColor(24, memory_unclaimed_io_read(0xd020) & 0xf);
  int screenEnabled = (memory_unclaimed_io_read(0xd011) & 0x10) ? 1 : 0;
  if (screenLineRegion && screenEnabled) {
    jchar bitmapMode = (memory_unclaimed_io_read(0xd011) & 0x20) ? 1 : 0;
    jchar multiColorMode = (memory_unclaimed_io_read(0xd016) & 0x10) ? 1 : 0;
    jchar screenMode = (bitmapMode << 1) | (multiColorMode);
    switch (screenMode) {
      case 0: //Normal text mode
        drawScreenLineNormalText();
      break;

      case 1: //Multi color text mode
        drawScreenLineMultiColorText();
      break;

      case 2: //Standard bitmap mode
      break;

      case 3: //Multicolor bitmap
        drawScreenLineMultiColorBitmap();
      break;
    }

  } else {
    fillColor(320, memory_unclaimed_io_read(0xd020) & 0xf);
  }
  fillColor(24, memory_unclaimed_io_read(0xd020) & 0xf);
}

We form a 2-bit number, screenMode, from the bitmapMode bit and multicolorMode bit from locations D011 and D016 respectively. It is this 2-bit number that we use for the switch selector.

As you can see, we have implemented all the graphics modes selectable with these two bits, except standard bitmap mode. This is because we don't need it yet for the game we are emulating.
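
To see the selector in isolation, here is a small sketch that takes the two register values as parameters. The name screen_mode is hypothetical; the bit tests are the same ones used in processLine.

```c
/* Hypothetical helper: same selector as in processLine, taking the raw
 * register values as parameters. Bit 5 of D011 selects bitmap mode and
 * bit 4 of D016 selects multi color mode. */
int screen_mode(int d011, int d016) {
  int bitmapMode = (d011 & 0x20) ? 1 : 0;
  int multiColorMode = (d016 & 0x10) ? 1 : 0;
  return (bitmapMode << 1) | multiColorMode;
}
```

For example, D011 = 0x3b together with D016 = 0xd8 gives 3, which is multi color bitmap mode.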

Let us now look at the implementation of these graphic modes.

Standard text mode we have already covered in the previous post, so I will not be covering it here.

Let us look at the implementation of multi color text mode:

static inline void drawScreenLineMultiColorText() {
  int i;
  int batchCharMem[40];
  int batchColorMem[40];
  int color_tablet[4];
  int memPointer = memory_unclaimed_io_read(0xd018);
  int videoMemoryBase = memPointer & 0xf0;
  videoMemoryBase = videoMemoryBase << 6;
  int charROMBase = memPointer & 0xe;
  charROMBase = charROMBase << 10;

  int backgroundColor = memory_unclaimed_io_read(0xd021) & 0xf;
  memory_read_batch(batchCharMem, videoMemoryBase + posInCharMem, 40);
  memory_read_batch_io_unclaimed(batchColorMem, 0xd800 + posInCharMem, 40);
  for (i = 0; i < 40; i++) {
    jchar charcode = batchCharMem[i];//memory_read(1024 + i + posInCharMem);
    int bitmapDataRow = memory_read_vic_model(((charcode << 3) | (line_in_visible & 7)) + charROMBase);
    int j;
    int foregroundColor = batchColorMem[i] & 0xf;//memory_read(0xd800 + i + posInCharMem) & 0xf;
    if (foregroundColor & 8) {
      foregroundColor = foregroundColor & 7;
      color_tablet[0] = backgroundColor;
      color_tablet[1] = memory_unclaimed_io_read(0xd022) & 0xf;
      color_tablet[2] = memory_unclaimed_io_read(0xd023) & 0xf;
      color_tablet[3] = foregroundColor;
          for (j = 0; j < 4; j++) {
            int pixelSet = bitmapDataRow & 0xc0;
            pixelSet = pixelSet >> 6;

            g_buffer[posInBuffer] = colors_RGB_565[color_tablet[pixelSet]];
            posInBuffer++;

            g_buffer[posInBuffer] = colors_RGB_565[color_tablet[pixelSet]];
            posInBuffer++;

            bitmapDataRow = bitmapDataRow << 2;
          }

    } else {
      for (j = 0; j < 8; j++) {
        foregroundColor = foregroundColor & 7;
        int pixelSet = bitmapDataRow & 0x80;
        if (pixelSet) {
          g_buffer[posInBuffer] = colors_RGB_565[foregroundColor];
        } else {
          g_buffer[posInBuffer] = colors_RGB_565[backgroundColor];
        }
        posInBuffer++;
        bitmapDataRow = bitmapDataRow << 1;
      }
    }
  }
}

For each character to display we build up a four color tablet mapping to each of the four combinations of a pixel pair. For multi color text Mode, the color entries are defined as follows:


  • Entry 0: Color stored in D021
  • Entry 1: Color stored in D022
  • Entry 2: Color stored in D023
  • Entry 3: Foreground color, that is color stored in color RAM for this character

Multi color text mode has an interesting caveat. If bit 3 of the color stored in color RAM for the applicable character is set to zero, then that character will be displayed in standard text mode, with the lower three bits giving the foreground color. We test for this scenario via the outer if-statement.
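
The two decisions made for each character can be sketched on their own as follows. Both helper names are hypothetical and used only for illustration.

```c
/* Hypothetical helper: returns 1 if the character should be drawn in
 * multi color, based on bit 3 of its color RAM entry. */
int uses_multicolor(int colorRamByte) {
  return (colorRamByte & 8) ? 1 : 0;
}

/* Hypothetical helper: returns pixel pair j (0 = leftmost, 3 = rightmost)
 * of one byte of character data. The result is an index into the four
 * entry color tablet. */
int pixel_pair(int bitmapDataRow, int j) {
  return (bitmapDataRow >> (6 - 2 * j)) & 3;
}
```

For the byte 0x1b (binary 00 01 10 11) the four pixel pairs select tablet entries 0, 1, 2 and 3 from left to right.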

Finally, let us look at the implementation of multi color bitmap mode:

static inline void drawScreenLineMultiColorBitmap() {
  int i;
  int batchCharMem[40];
  int batchColorMem[40];
  int memPointer = memory_unclaimed_io_read(0xd018);
  int videoMemoryBase = memPointer & 0xf0;
  videoMemoryBase = videoMemoryBase << 6;
  int charROMBase = memPointer & 0xe;
  charROMBase = charROMBase << 10;
  int color_tablet[4];
  color_tablet[0] = memory_unclaimed_io_read(0xd021) & 0xf;
  //int backgroundColor = memory_unclaimed_io_read(0xd021) & 0xf;
  memory_read_batch(batchCharMem, videoMemoryBase + posInCharMem, 40);
  memory_read_batch_io_unclaimed(batchColorMem, 0xd800 + posInCharMem, 40);
  for (i = 0; i < 40; i++) {
    jchar charcode = batchCharMem[i];//memory_read(1024 + i + posInCharMem);
    color_tablet[1] = charcode >> 4;
    color_tablet[2] = charcode & 0xf;
    color_tablet[3] = batchColorMem[i] & 0xf;
    int bitmapDataRow = memory_read_vic_model((((posInCharMem + i) << 3) | (line_in_visible & 7)) + charROMBase);
    int j;
    //int foregroundColor = batchColorMem[i] & 0xf;//memory_read(0xd800 + i + posInCharMem) & 0xf;

    for (j = 0; j < 4; j++) {
      int pixelSet = bitmapDataRow & 0xc0;
      pixelSet = pixelSet >> 6;

      g_buffer[posInBuffer] = colors_RGB_565[color_tablet[pixelSet]];
      posInBuffer++;

      g_buffer[posInBuffer] = colors_RGB_565[color_tablet[pixelSet]];
      posInBuffer++;

      bitmapDataRow = bitmapDataRow << 2;
    }
  }
}

Very similar to text mode multi color, but with subtle differences.

First, compare the assignment of the variable bitmapDataRow between the two modes. Multi color text mode uses character code whereas multi color bitmap mode uses the position in character memory as index.

The definition of the 4 color tablet is also different:


  • Entry 0: Background color
  • Entry 1: Upper 4 bits of character code in screen memory
  • Entry 2: Lower 4 bits of character code in screen memory
  • Entry 3: Color code in Color RAM for character.
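
The two differences can be sketched in isolation as follows. Both helpers are hypothetical and only restate what drawScreenLineMultiColorBitmap does for a single 8x8 cell.

```c
/* Hypothetical helper: fill the four entry color tablet for one 8x8
 * cell in multi color bitmap mode.
 * background: value of location D021
 * screenByte: byte from screen memory for this cell
 * colorRamByte: byte from color RAM for this cell */
void bitmap_color_tablet(int background, int screenByte, int colorRamByte, int tablet[4]) {
  tablet[0] = background & 0xf;
  tablet[1] = (screenByte >> 4) & 0xf;
  tablet[2] = screenByte & 0xf;
  tablet[3] = colorRamByte & 0xf;
}

/* Hypothetical helper: offset of the bitmap byte for a given cell and
 * pixel line, relative to the start of the bitmap. Every cell occupies
 * eight consecutive bytes. */
int bitmap_byte_offset(int cellIndex, int line) {
  return (cellIndex << 3) | (line & 7);
}
```

So the second cell of the second character row (cell 41) on pixel line 3 is found 331 bytes into the bitmap.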

Implementing Raster Interrupts

In order for the game Dan Dare to function properly within our emulator, we need to implement raster interrupts.

The basics of implementing raster interrupts within our emulator boil down to raising a flag when we hit a particular line number. So, we will start to implement this basic functionality within video.c:

...
jchar vic_interrupt = 0;
...
int raster_int_enabled() {
  return (memory_unclaimed_io_read(0xd01a) & 1) ? 1 : 0;
}


void video_line_expired(struct timer_struct *tdev) {
  tdev->remainingCycles = 63;
  processLine();
  line_count++;
  jchar RST_0_7 = memory_unclaimed_io_read(0xd012);
  jchar RST_8 = (memory_unclaimed_io_read(0xd011) & 0x80) << 1;
  int targetRasterLine = RST_8 | RST_0_7;
  if (line_count > 310) {
    line_count = 0;
    frameFinished = 1;
    posInBuffer = 0;
  }

    if ((targetRasterLine == line_count) && raster_int_enabled())
      vic_interrupt = vic_interrupt | 1 | 128;

}


We get the target raster line by looking at location D012 and bit 7 of location D011. When an interrupt is triggered we set both bit 7 and bit 0 of vic_interrupt.
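
The 9-bit target line can be worked out on its own like this. The helper name target_raster_line is hypothetical; the arithmetic is the same as in video_line_expired.

```c
/* Hypothetical helper: combine D012 with bit 7 of D011 into the 9-bit
 * raster line at which an interrupt is requested. */
int target_raster_line(int d011, int d012) {
  int rst8 = (d011 & 0x80) << 1;   /* bit 7 moves up to bit 8 */
  return rst8 | (d012 & 0xff);
}
```

For example, D011 = 0x9b with D012 = 0x05 requests an interrupt at line 261, whereas D011 = 0x1b with D012 = 0x32 requests one at line 50.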

The next step is to actually trigger an interrupt on our CPU when a raster interrupt occurs. For this we first need to create a helper method within video.c:

int vic_raster_int_occured() {
  return (vic_interrupt > 128) ? 1 : 0;
}

We next need to invoke this method within cpu.c:

...
  void process_interrupts() {
    if (interruptFlag == 1)
      return;
    if ((trigger_irq() == 0) && (vic_raster_int_occured() == 0))
      return;
    pushWord(pc);
    breakFlag = 0;
    Push(getStatusFlagsAsByte());
    breakFlag = 1;
    interruptFlag = 1;
    int tempVal = memory_read(0xffff) * 256;
    tempVal = tempVal + memory_read(0xfffe);
    pc = tempVal;

  }
...

Next up, we need to interface memory.c with the raster interrupt functionality. For this, we need to add the following helper methods within video.c:

int read_vic_int_reg () {
  return vic_interrupt;
}

void write_vic_int_reg(jchar value) {
  value = ~value & 0x7f;
  vic_interrupt = vic_interrupt & value;
  if (vic_interrupt > 0)
    vic_interrupt = vic_interrupt | 128;
}

We have one method for reading the interrupt status register and another one for writing to it.

Writing to this register is how interrupts are cleared/acknowledged.

Admittedly, the way the VIC-II has implemented acknowledging interrupts is quite strange. In a CIA chip, for example, an interrupt is acknowledged simply by reading the interrupt status register. In the VIC-II chip, however, you need to write a one to the interrupt bit in order to clear it.

I have implemented this interrupt acknowledgement mechanism with write_vic_int_reg. Basically I invert the value provided and then use the result to mask off the applicable bits.
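
Here is the same masking logic as a stand-alone sketch that returns the new register contents instead of updating a global, which makes it easy to try out a few values. The name vic_ack is hypothetical.

```c
/* Hypothetical stand-alone version of write_vic_int_reg: returns the new
 * register contents instead of updating the vic_interrupt global. */
int vic_ack(int current, int written) {
  int keep = ~written & 0x7f;      /* bits written as 1 are dropped */
  int result = current & keep;     /* this also clears summary bit 7 */
  if (result > 0)
    result |= 128;                 /* other sources are still pending */
  return result;
}
```

Writing 0x01 while only the raster interrupt is pending (0x81) clears the register completely. If another source is pending as well (0x83), bit 7 stays set and the result is 0x82.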

Finally, we need to wire the above mentioned methods within memory.c:

...
jchar memory_read(int address) {
  if ((address >=0xa000) && (address < 0xc000) && basicROMEnabled())
    return basicROM[address & 0x1fff];
  else if ((address >=0xe000) && (address < 0x10000) && kernalROMEnabled())
    return kernalROM[address & 0x1fff];
  else if (address == 1)
    return read_port_1();
  else if ((address >=0xd000) && (address < 0xe000) && IOEnabled()) {
    if ((address >=0xdc00) && (address < 0xdc10))
      return cia1_read(address);
    else if (address == 0xd011) {
      int tempValue = IOUnclaimed[address & 0xfff] & 0x7f;
      tempValue = tempValue | ((line_count & 0x100) >> 1);
      return tempValue;
    }
    else if (address == 0xd012)
      return line_count & 0xff;
    else if (address == 0xd019)
      return read_vic_int_reg();
    else
      return IOUnclaimed[address & 0xfff];
  }
  else
    return mainMem[address];
}
...
void memory_write(int address, jchar value) {
  //if (((address >= 0xa000) && (address < 0xc000)) |
  //     ((address >= 0xe000) && (address < 0x10000)))
  //  return;

  if (address == 1)
    write_port_1(value);
  else if ((address >=0xd000) && (address < 0xe000) && IOEnabled()) {
    if ((address >=0xdc00) && (address < 0xdc10))
      cia1_write(address, value);
    else if (address == 0xd019)
      write_vic_int_reg(value);
    else
      IOUnclaimed[address & 0xfff] = value;
  }
  else
    mainMem[address] = value;
}
...

A Test Run

With everything implemented, I have taken a couple of screenshots:





As you can see, our game characters are still invisible. That is because they are sprites, which is functionality that we haven't implemented yet.

In Summary

In this post we have implemented the remaining Graphics modes required in order to render the game Dan Dare.

We have also implemented raster interrupt emulation and the VIC memory model.

In the next post we will be implementing sprite emulation.

Till next time!


Wednesday 7 December 2016

Part 15: Emulating the loading screen

Foreword

In the previous post we have implemented Tape emulation.

So what are we going to do in this post?

Well, up to now we have been using the KERNAL ROM and BASIC ROM as baseline for evolving our emulator.

For the rest of these posts, I will be using a tape image of the game Dan Dare as the next baseline for evolving our Android emulator.

When the game Dan Dare loads from tape, it shows some visual effects that include flashing borders and a splash screen.

So, as a next step we will try to emulate these flashing borders and the splash screen. There will, however, be a bit of work ahead of us to get to this point.

So, let us start!

Introduction to scanline rendering

The flashing border effect is achieved by the loader by changing the border colors many times while the frame is drawn to the screen in a scan-like fashion.

So, in order to get the flashing border effect in our emulator, we will need to implement scan-like rendering. Currently this is not the case. We execute a frame's worth of instruction cycles and after that we just render the screen in one shot. With this type of rendering you will always see a solid border instead of stripes.

How do we implement scan-like rendering? With this kind of rendering you don't leave rendering a frame to the last moment, when you have executed a whole frame's worth of cycles. Rather, you do the rendering each time you have executed a line's worth of cycles. On a C64 a line's worth of cycles is 63 clock cycles.

Interestingly enough, in my series on writing a JavaScript C64 emulator, we used a rendering scheme that was more granular than a scan line. In that emulator we actually rendered pixels after each 6502 instruction execution! This type of rendering, by the way, is called cycle based rendering.

Cycle based rendering is more complicated than scan line based rendering in many respects. In fact, I am not sure whether an Android mobile device will have enough horsepower to deal with cycle based rendering in real time.

So, with our Android C64 emulator I will stick with scan line based rendering.

Creating a new timer

In effect, we want a rendering action to kick off each time about a scan line's worth of instruction cycles has passed.

From the previous section we know that each scan line is worth 63 6502 cycles on the C64.

So, we need to schedule a task that gets executed every 63 cycles.

We already have a mechanism in place for timers that we are currently using for CIA timers and for Tape emulation. So, it should be fairly easy for us just to add another timer to the equation.

First we need to create a file called video.c. Within this file we define the following:

void video_line_expired(struct timer_struct *tdev) {
  tdev->remainingCycles = 63;
}

struct timer_struct getVideoInstance() {
  struct timer_struct myVideo;
  myVideo.expiredevent = &video_line_expired;
  myVideo.remainingCycles = 63;
  myVideo.started = 1;
  return myVideo;
}


We have defined a function that will return a timer_struct instance for video rendering. This timer will expire after 63 clock cycles. When the expiry event is invoked (i.e. the function video_line_expired), remainingCycles will be reset to 63 clock cycles.
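
To show the 63 cycle rhythm in action, here is a cut-down sketch of a timer being serviced after every instruction. The struct demo_timer and the function demo_run are hypothetical stand-ins and not the emulator's actual timer code; only the idea of counting down and re-arming at 63 is taken from the snippet above.

```c
/* Hypothetical, cut-down stand-in for timer_struct and the code that
 * services the timers; just enough to show the 63 cycle rhythm. */
struct demo_timer {
  int remainingCycles;
  int started;
  void (*expiredevent)(struct demo_timer *tdev);
};

static int demoLineEvents = 0;

static void demo_line_expired(struct demo_timer *tdev) {
  tdev->remainingCycles = 63;     /* re-arm for the next scan line */
  demoLineEvents++;
}

/* Run a number of instructions that each take the same number of
 * cycles and return how many scan line events fired. */
int demo_run(int cyclesPerInstruction, int instructions) {
  struct demo_timer video = { 63, 1, &demo_line_expired };
  int i;
  demoLineEvents = 0;
  for (i = 0; i < instructions; i++) {
    if (!video.started)
      continue;
    video.remainingCycles -= cyclesPerInstruction;
    if (video.remainingCycles <= 0)
      video.expiredevent(&video);
  }
  return demoLineEvents;
}
```

Running 42 instructions of 3 cycles each (126 cycles) fires the line event twice. Note that re-arming with a fixed 63 drops any cycles by which an instruction overshoots the end of the line, so the line timing is approximate.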

Finally, we need to ensure that our new timer gets added to the list of timers that gets processed after each instruction execution:

void
Java_com_johan_emulator_engine_Emu6502_memoryInit(JNIEnv* pEnv, jobject pObj)
{
  timerA = getTimerInstanceA();
  add_timer_to_list(&timerA);
  timerB = getTimerInstanceB();
  add_timer_to_list(&timerB);
  tape_timer = getTapeInstance();
  add_timer_to_list(&tape_timer);
  video_timer = getVideoInstance();
  add_timer_to_list(&video_timer);
}

Defining the Color Tablet

In my JavaScript C64 Emulator Series, here, I have defined a 16 element array containing the C64 color tablet in RGB values.

We can use this array declaration as is in our Android Emulator, with some minor syntax tweaks:

jchar colors_RGB_888[16][3] = {
  {0, 0, 0},
  {255, 255, 255},
  {136, 0, 0},
  {170, 255, 238},
  {204, 68, 204},
  {0, 204, 85},
  {0, 0, 170},
  {238, 238, 119},
  {221, 136, 85},
  {102, 68, 0},
  {255, 119, 119},
  {51, 51, 51},
  {119, 119, 119},
  {170, 255, 102},
  {0, 136, 255},
  {187, 187, 187}
};

A thing to keep in mind, however, is that the bitmap we are writing to is not of format RGB_888, but RGB_565, which is 5 bits for the Red channel, 6 bits for the Green channel and another 5 bits for the Blue channel. Our Bitmap therefore uses 2 bytes per pixel instead of the three bytes per pixel that our palette consists of.

We could change our 16 color palette declaration by hand to RGB_565. However, this is prone to errors. Instead I am going to do conversion in code at start up and populate a new array:

...
jchar colors_RGB_565[16]; 
...
void initialise_video() {
  int i;
  for (i=0; i < 16; i++) {
    int red = colors_RGB_888[i][0] >> 3;
    int green = colors_RGB_888[i][1] >> 2;
    int blue = colors_RGB_888[i][2] >> 3;
    colors_RGB_565[i] =  (red << 11) | (green << 5) | (blue << 0);
  }
}
...

One should now just ensure that initialise_video() gets called as part of the initialisation process.
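
To check the conversion by hand, here is the same shifting as a small stand-alone function. The name rgb888_to_565 is hypothetical.

```c
/* Hypothetical helper: pack one 8-bit-per-channel color into RGB_565,
 * using the same shifts as initialise_video. */
unsigned short rgb888_to_565(int red, int green, int blue) {
  return (unsigned short)(((red >> 3) << 11) | ((green >> 2) << 5) | (blue >> 3));
}
```

Palette entry 2, which is {136, 0, 0}, becomes 0x8800: 136 shifted right by 3 gives 17, and 17 shifted left by 11 gives 0x8800. White becomes 0xffff.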

Defining Process Flow

Let us define the general process flow when we process a scan line.

This is more or less defined by the following bits of code within video.c:

...
int line_count = 0;
...
extern int frameFinished;
...
void video_line_expired(struct timer_struct *tdev) {
  tdev->remainingCycles = 63;
  processLine();
  line_count++;
  if (line_count > 310) {
    line_count = 0;
    frameFinished = 1;
  }
}

All processing for the current line will be performed within the processLine() function, which we will cover later.

We also keep track of the number of lines we have already processed within the frame. As soon as we reach line 311 we wrap back to zero.

In effect, we also need to keep track of the number of lines so that we know when we are finished with the current frame.

Previously the runBatch method within cpu.c kept an eye on when we were finished with the current frame by looping through exactly 20000 cycles. This will need to change, since we need to sync our runBatch method to the exact moment when video.c is finished rendering the frame. If we don't do that, we might end up writing frames to the screen containing some residue of the previous frame.

The frameFinished variable will help us with this syncing. As you can see, frameFinished is declared with extern, which means that the variable is actually defined in another file. In fact, I have defined this variable within cpu.c.

Our modified runBatch method will look as follows:

...
int frameFinished = 0;
...
int runBatch(int address) {
  //remainingCycles = 20000;
  frameFinished = 0;
  int lastResult = 0;
  while ((!frameFinished) && (lastResult == 0)) {
    lastResult = step();
    if (lastResult != 0)
      break;
    if ((address > 0) && (pc == address)) {
      lastResult = -1;
      break;
    }
    processAlarms();
    
  }

  return lastResult;
}


Drawing a scan line

Let us now discuss the process of drawing a scanline. This is performed within the function processLine():

static inline void processLine() {
  if (line_count > 299)
    return;

  updatelineCharPos();
  fillColor(24, memory_read(0xd020) & 0xf);
  int screenEnabled = (memory_read(0xd011) & 0x10) ? 1 : 0;
  if (screenLineRegion && screenEnabled) {
    drawScreenLine();
  } else {
    fillColor(320, memory_read(0xd020) & 0xf);
  }
  fillColor(24, memory_read(0xd020) & 0xf);
}


Firstly, you will see that if the line number is bigger than 299 we exit the processLine function altogether. This is because the remaining lines of the frame fall within vertical blanking, that is, they are not displayed at all.

However, we still need to account for these lines to get us closer to true emulation speed.

The updatelineCharPos() function ensures that we always have an up-to-date pointer to the beginning of the character row in screen character memory that we are currently busy with.
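
As a rough idea of the arithmetic involved, the sketch below maps a scan line inside the 320x200 screen area to an offset in screen character memory. The helper names char_row_offset and row_in_char are hypothetical and only assume the standard layout of 40 characters per row with 8 pixel lines per character; the actual updatelineCharPos() may do its bookkeeping incrementally instead.

```c
#define CHARS_PER_ROW 40

/* line_in_screen: 0..199, counted from the first line of the 320x200 area.
   Returns the offset of the first character of that character row. */
static int char_row_offset(int line_in_screen) {
  return (line_in_screen >> 3) * CHARS_PER_ROW;
}

/* Which of the 8 pixel rows of a character this scan line falls on. */
static int row_in_char(int line_in_screen) {
  return line_in_screen & 7;
}
```

So scan lines 0 to 7 all read from offset 0, lines 8 to 15 from offset 40, and so on.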

The process of drawing a line is basically as follows:

  1. Draw left border
  2. Draw a 320 pixel line of main screen
  3. Draw right border

Step 2 can also be drawn entirely in the border color in the following cases:

  • We are currently in the top or bottom border area.
  • The screen is currently disabled, that is, bit 4 of location 0xd011 is set to zero.

We use the fillColor method to fill a line segment with the border color. The method looks like this:

inline void fillColor(int count, int colorEntryNumber) {
  int currentPos;
  for (currentPos = 0; currentPos < count; currentPos++) {
    g_buffer[posInBuffer] = colors_RGB_565[colorEntryNumber];
    posInBuffer++;
  }
}
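
The name colors_RGB_565 suggests that each palette entry is a 16-bit value with 5 bits of red, 6 bits of green and 5 bits of blue. Assuming that is the case, the following sketch shows how such an entry could be packed from ordinary 8-bit color components. The function name rgb888_to_rgb565 is made up for this example.

```c
#include <stdint.h>

/* Pack 8-bit red, green and blue into one 16-bit RGB565 value:
   red in bits 15..11, green in bits 10..5, blue in bits 4..0. */
static uint16_t rgb888_to_rgb565(uint8_t r, uint8_t g, uint8_t b) {
  return (uint16_t)(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}
```

Green gets the extra bit because the eye is most sensitive to it, which is why the three shifts are not symmetrical.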

Drawing of the main screen line is performed by drawScreenLine():

static inline void drawScreenLine() {
  int i;
  for (i = 0; i < 40; i++) {
    jchar charcode = memory_read(1024 + i + posInCharMem);
    int bitmapDataRow = charRom[(charcode << 3) | (line_in_visible & 7)];
    int j;
    int foregroundColor = memory_read(0xd800 + i + posInCharMem) & 0xf;
    int backgroundColor = memory_read(0xd021) & 0xf;
    for (j = 0; j < 8; j++) {
      int pixelSet = bitmapDataRow & 0x80;
      if (pixelSet) {
        g_buffer[posInBuffer] = colors_RGB_565[foregroundColor];
      } else {
        g_buffer[posInBuffer] = colors_RGB_565[backgroundColor];
      }
      posInBuffer++;
      bitmapDataRow = bitmapDataRow << 1;
    }
  }
}

We basically have a main loop in which we loop through the characters in screen memory for the current character row. We fetch the character code for each character and look up an 8-pixel line of bitmap data for it in the character ROM.

We then draw each pixel of the 8-pixel line in the foreground color if its bit is set, otherwise in the background color.
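
To see the inner loop in isolation, here is a small sketch that expands one byte of character bitmap data into eight pixels. The function decode_char_row and its output array are assumptions for the sake of the example; the emulator itself writes straight into g_buffer.

```c
#include <stdint.h>

/* Expand one byte of character bitmap data into 8 pixels, leftmost first. */
static void decode_char_row(uint8_t bits, uint16_t fg, uint16_t bg,
                            uint16_t out[8]) {
  int j;
  for (j = 0; j < 8; j++) {
    out[j] = (bits & 0x80) ? fg : bg;  /* test the leftmost remaining bit */
    bits = (uint8_t)(bits << 1);       /* move the next bit into position */
  }
}
```

For example, the byte 0x81 (binary 10000001) yields a foreground pixel at both ends of the row and background pixels in between.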

Testing and Debugging

When I took the code changes done in this post for a test drive a couple of bugs surfaced.

Since I am basing this series of blog posts on my previous series on a JavaScript emulator, all these bugs look kind of familiar :-) We can therefore draw on past learnings and avoid some painful debugging exercises.

In the process of loading the game from the tape image, I experienced two major bugs. Both of these bugs were related to C64 bank switching not being implemented.

So, let us quickly spend some time implementing bank switching. I am going to implement bank switching for the KERNAL ROM, the BASIC ROM and the IO area.

As you may know, bank switching is controlled via the lower three bits of memory location 1. The combinations you need to use for enabling the various banking configurations are not very intuitive, let alone writing understandable emulation code for these configurations.

To simplify our world a bit, we can create a lookup table that accepts the lower three bits of memory location 1 as an index and returns a flag byte. This flag byte contains one bit each for BASIC ROM, KERNAL ROM, IO and CHAR ROM. Each bit indicates whether the corresponding region is visible.

We add the following code to memory.c:

#define BASIC_VISIBLE 1
#define KERNAL_VISIBLE 2
#define CHAR_ROM_VISIBLE 4
#define IO_VISIBLE 8

int bank_visibility[8] =
{
  0,//000
  CHAR_ROM_VISIBLE,//001
  CHAR_ROM_VISIBLE | KERNAL_VISIBLE,//010
  BASIC_VISIBLE | KERNAL_VISIBLE | CHAR_ROM_VISIBLE,//011
  0,//100
  IO_VISIBLE,//101
  IO_VISIBLE | KERNAL_VISIBLE,//110
  BASIC_VISIBLE | KERNAL_VISIBLE | IO_VISIBLE//111
};
...
inline int kernalROMEnabled() {
  int bankBits = mainMem[1] & 7;
  return (bank_visibility[bankBits] & KERNAL_VISIBLE) ? 1 : 0;
}

inline int basicROMEnabled() {
  int bankBits = mainMem[1] & 7;
  return (bank_visibility[bankBits] & BASIC_VISIBLE) ? 1 : 0;
}

inline int IOEnabled() {
  int bankBits = mainMem[1] & 7;
  return (bank_visibility[bankBits] & IO_VISIBLE) ? 1 : 0;
}


We now have some helper methods that tell us which banks are enabled.

We now change memory_read and memory_write as follows:

...
jchar IOUnclaimed[4096];
...
jchar memory_read(int address) {
  if ((address >=0xa000) && (address < 0xc000) && basicROMEnabled())
    return basicROM[address & 0x1fff];
  else if ((address >=0xe000) && (address < 0x10000) && kernalROMEnabled())
    return kernalROM[address & 0x1fff];
  else if (address == 1)
    return read_port_1();
  else if ((address >=0xd000) && (address < 0xe000) && IOEnabled()) {
    if ((address >=0xdc00) && (address < 0xdc10))
      return cia1_read(address);
    else
      return IOUnclaimed[address & 0xfff];
  }
  else
    return mainMem[address];
}

void memory_write(int address, jchar value) {
  //if (((address >= 0xa000) && (address < 0xc000)) |
  //     ((address >= 0xe000) && (address < 0x10000)))
  //  return;

  if (address == 1)
    write_port_1(value);
  else if ((address >=0xd000) && (address < 0xe000) && IOEnabled()) {
    if ((address >= 0xdc00) && (address < 0xdc10))
      cia1_write(address, value);
    else
      IOUnclaimed[address & 0xfff] = value;
  }
  else
    mainMem[address] = value;
}

We have defined the array IOUnclaimed for reads and writes when the IO region is enabled.
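
To make the decision logic easier to follow, the sketch below condenses it into a single function that reports which region an address maps to for a given value of memory location 1. The enum, the function region_for and the table name bank_visibility_sketch are hypothetical and repeat the lookup table only so that the example stands on its own. Note that it also reports the character ROM, which the memory_read above does not yet serve to the CPU.

```c
#define BASIC_VISIBLE 1
#define KERNAL_VISIBLE 2
#define CHAR_ROM_VISIBLE 4
#define IO_VISIBLE 8

enum region { REGION_RAM, REGION_BASIC, REGION_KERNAL, REGION_IO,
              REGION_CHAR_ROM };

/* Same contents as bank_visibility, indexed by the low three bits of port 1. */
static const int bank_visibility_sketch[8] = {
  0,
  CHAR_ROM_VISIBLE,
  CHAR_ROM_VISIBLE | KERNAL_VISIBLE,
  BASIC_VISIBLE | KERNAL_VISIBLE | CHAR_ROM_VISIBLE,
  0,
  IO_VISIBLE,
  IO_VISIBLE | KERNAL_VISIBLE,
  BASIC_VISIBLE | KERNAL_VISIBLE | IO_VISIBLE
};

/* Which region does a CPU read at 'address' hit, given the port 1 value? */
static enum region region_for(int address, int port1) {
  int flags = bank_visibility_sketch[port1 & 7];
  if (address >= 0xa000 && address < 0xc000 && (flags & BASIC_VISIBLE))
    return REGION_BASIC;
  if (address >= 0xe000 && address < 0x10000 && (flags & KERNAL_VISIBLE))
    return REGION_KERNAL;
  if (address >= 0xd000 && address < 0xe000) {
    if (flags & IO_VISIBLE)
      return REGION_IO;
    if (flags & CHAR_ROM_VISIBLE)
      return REGION_CHAR_ROM;
  }
  return REGION_RAM;  /* everything else falls through to RAM */
}
```

With the power-on value 0x37 in location 1, the low three bits are 111, so 0xa000 maps to BASIC ROM and 0xd020 to the IO area. Writing 0x35 (bits 101) leaves only the IO area switched in, and 0xe000 then reads from RAM.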

With all these changes applied, let us look at some screenshots of the emulator loading the game:






The last screen appears a bit garbled because we haven't implemented the full VIC-II memory model yet. We will tackle this in the next post.


Performance considerations

With all this scan line rendering code that we wrote, I was curious to know what performance penalty this functionality caused. This is quite a general concern when developing software for a mobile device because of resource limitations.

I got hold of the code we developed in the post Part 12: Emulating the keyboard and used it to do some baseline benchmarks.

In that post we still rendered the frame after executing a frame's worth of CPU cycles. So I measured the time separately for running runBatch() and for running populateFrame.

On average the measurements showed between 1 and 2 milliseconds for running runBatch, and less than a millisecond for running populateFrame.

With this baseline in mind, I ran a benchmark for the code developed in this post. The rendering functionality developed in this post is woven in between executing CPU code, so it is not feasible to get two separate timings for CPU execution and rendering per frame. So, for this post I only retrieved a single average timing.

This timing value was a bit of a disappointment. The average time per frame, which includes CPU execution and rendering, was between 6 and 7 milliseconds.
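
For anyone wanting to repeat such a measurement, here is one possible way to collect an average time per frame in native code. This is only a sketch and not the code I used for the figures above; the helper names are made up, and it relies on the standard C clock() function, which reports processor time rather than wall-clock time. On Android, clock_gettime with CLOCK_MONOTONIC would be an alternative.

```c
#include <time.h>

static double total_ms = 0.0;   /* sum of all recorded frame times */
static long frames_timed = 0;   /* number of frames recorded */

/* Processor time in milliseconds, from the standard C clock(). */
static double cpu_time_ms(void) {
  return (double)clock() * 1000.0 / CLOCKS_PER_SEC;
}

/* Record the duration of one frame.
   Typical use: t0 = cpu_time_ms(); run one frame; record_frame(t0, cpu_time_ms()); */
static void record_frame(double start_ms, double end_ms) {
  total_ms += end_ms - start_ms;
  frames_timed++;
}

/* Average time per recorded frame, or 0 if nothing was recorded yet. */
static double average_frame_ms(void) {
  return frames_timed ? total_ms / frames_timed : 0.0;
}
```

Averaging over a few hundred frames smooths out the noise you inevitably get from individual frames on a busy device.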

I thought of ways to reduce this time, and all I could think of for the moment was to combine some of the memory reads issued by the rendering code into one big batch.

This batching happens within the drawScreenLine method of video.c:

static inline void drawScreenLine() {
  int i;
  int batchCharMem[40];
  int batchColorMem[40];
  int backgroundColor = memory_unclaimed_io_read(0xd021) & 0xf;
  memory_read_batch(batchCharMem, 1024 + posInCharMem, 40);
  memory_read_batch_io_unclaimed(batchColorMem, 0xd800 + posInCharMem, 40);
  for (i = 0; i < 40; i++) {
    jchar charcode = batchCharMem[i];//memory_read(1024 + i + posInCharMem);
    int bitmapDataRow = charRom[(charcode << 3) | (line_in_visible & 7)];
    int j;
    int foregroundColor = batchColorMem[i] & 0xf;//memory_read(0xd800 + i + posInCharMem) & 0xf;

    for (j = 0; j < 8; j++) {
      int pixelSet = bitmapDataRow & 0x80;
      if (pixelSet) {
        g_buffer[posInBuffer] = colors_RGB_565[foregroundColor];
      } else {
        g_buffer[posInBuffer] = colors_RGB_565[backgroundColor];
      }
      posInBuffer++;
      bitmapDataRow = bitmapDataRow << 1;
    }
  }
}

The implementation of the two batch methods looks as follows:

void memory_read_batch(int *batch, int address, int count) {
  int i;
  for (i = 0; i < count; i++) {
    batch[i] = mainMem[address + i];
  }
}

void memory_read_batch_io_unclaimed(int *batch, int address, int count) {
  int i;
  address = address & 0xfff;
  for (i = 0; i < count; i++) {
    batch[i] = IOUnclaimed[address + i];
  }
}

Implementing the above-mentioned code shaved about a millisecond off the total time per frame.

In Summary

In this post we have implemented color graphics and scan line based rendering.

We have also implemented bank switching of the C64.

In the next post, we will see how far we can get with implementing the other graphic modes of the C64 in order to get the game Dan Dare in a playable state in our emulator.

Till next time!


Thursday 1 December 2016

Part 14: Implementing Tape Emulation

Foreword

In the previous post we implemented the complete emulation of timers of CIA#1. This is mandatory in order to implement Tape emulation.

In this post we will be finishing off the implementation of tape emulation. Apart from the technical emulation aspects of tape loading, we will also be implementing a file browser you will use to browse to the .tap file on your mobile device that you want to load.

As an added bonus, I will also be showing how you would go about enlarging the video output. As you know, the C64 screen has a resolution of 320x200 pixels. This can result in quite a small block on a high resolution screen. To counteract this inconvenience, I will also be showing in this post how to scale this image so that it displays larger.

Implementing a File Browser

My first step was to implement a file browser allowing you to browse to a .tap file on your device that you want to attach.

Luckily I didn't need to re-invent the wheel here. I found a very nice tutorial on the net showing you how to create an Android File Browser together with some source code:

http://custom-android-dn.blogspot.co.za/2013/01/create-simple-file-explore-in-android.html

I ended up modifying the code a bit so that it fits within our application.

I am not going to go into a lot of detail on how to create the file browser. I will, however, highlight a couple of points of importance.

The file browser consists of two screens:



So, basically, if you know the path to the file you can just type it in the first screen and tap Attach. Otherwise, just tap on Browser and browse to the file.

Needless to say, each of the two screens above is an activity in its own right. The class name of the first screen is FileDialogueActivity.class and that of the second screen is FileChooser.class.

Something interesting I want to highlight is that there is parameter passing between FileDialogueActivity.class and FileChooser.class. This is functionality you would quite often use when writing an Android application. Like in our case, you would launch FileChooser from FileDialogueActivity so that the user can choose a file and when that activity returns, you will want to know which file was chosen.

Let us look at some code for this functionality. When you tap on the Browser button in FileDialogueActivity, the following method will execute:

    public void getfile(View view){
        Intent intent1 = new Intent(this, FileChooser.class);
        startActivityForResult(intent1,REQUEST_PATH);
    }

This code looks close to the conventional way we have used up to now for invoking a new activity, but with a subtle difference. The activity FileChooser is started with the method call startActivityForResult. This launches FileChooser, but indicates that we expect a result back. This will become clear in a moment.

Now, at the point when FileChooser is about to return, that is, a file was chosen, the following method will execute:

    private void onFileClick(Item o)
    {
        //Toast.makeText(this, "Folder Clicked: "+ currentDir, Toast.LENGTH_SHORT).show();
        Intent intent = new Intent();
        intent.putExtra("GetPath",currentDir.toString());
        intent.putExtra("GetFileName",o.getName());
        setResult(RESULT_OK, intent);
        finish();
    }

As you can see, we are creating an Intent instance for the purpose of passing back values to FileDialogueActivity. The values of interest are GetPath, GetFileName and finally the result code, which is RESULT_OK when all went well.

Finally, we are calling finish(), which will cause us to move back to FileDialogueActivity. With FileDialogueActivity coming back to life, we need to know which file was chosen within FileChooser. To obtain this knowledge we need to add a method to FileDialogueActivity called onActivityResult:

    protected void onActivityResult(int requestCode, int resultCode, Intent data){
        // See which child activity is calling us back.
        if (requestCode == REQUEST_PATH){
            if (resultCode == RESULT_OK) {
                curFileName = data.getStringExtra("GetPath") + "/" +data.getStringExtra("GetFileName");
                edittext.setText(curFileName);
            }
        }
    }

This is a method that we override from the base class Activity. As one of the parameters we receive the Intent instance that we previously created and populated with parameters.

Subsequently we can retrieve the parameters from the Intent instance and do something useful with them. In our case we create an absolute path and show it in the edit box of the file dialogue.

Next, we should implement similar parameter passing between FrontActivity (i.e. the main activity of our emulator) and FileDialogueActivity. We will cover this in the next section.

Loading a tape image into Memory

In the previous section we have seen how to implement a file browser within an Android application.

This whole exercise was just to obtain the absolute path to the tape image.

In this section we will be interfacing our FrontActivity to the file browser in order to get hold of the absolute path to a tape image file and then load this file into memory.

We start off by adding an extra menu item for browsing for a tape image and implement an event handler for it within the FrontActivity:

    @Override
    public boolean onOptionsItemSelected(MenuItem item) {
        // Handle action bar item clicks here. The action bar will
        // automatically handle clicks on the Home/Up button, so long
        // as you specify a parent activity in AndroidManifest.xml.
        int id = item.getItemId();
        //noinspection SimplifiableIfStatement
        if (id == R.id.action_stop) {
            switchToDebug = true;
            return true;
        } else if (id == R.id.action_attach) {
            Intent i = new Intent(this, FileDialogueActivity.class);
            startActivityForResult(i, 1);

            return true;
        }

        return super.onOptionsItemSelected(item);
    }


I will cover startActivityForResult in a moment. Let us first have a look at what happens when you click the Attach button, at which point you will effectively be leaving the File Dialogue:

    public void onAttachClick(View v) {
        Intent intent = new Intent();
        intent.putExtra("GetFullPath",curFileName);
        setResult(RESULT_OK, intent);
        finish();

    }

As you see, we are passing the full path back to our FrontActivity.

Now, let us have a look at the implementation of onActivityResult within FrontActivity, which receives the result of the startActivityForResult call:

    protected void onActivityResult(int requestCode, int resultCode, Intent data){
        // See which child activity is calling us back.
        if (requestCode == 1){
            if (resultCode == RESULT_OK) {
                String curFileName = data.getStringExtra("GetFullPath");
                ...
            }
        }
    }

At this point in our code, we have a handle on the absolute path of the requested tape image file.

The question now is, what do we do with this absolute path?

The quick and easy answer is to just open this file and load it into memory. One might think that this is quite a waste. Shouldn't we rather just open the tape image and read it bit by bit as required while the C64 system is loading the game from tape?

I have actually been pondering this question for some time. However, in general these tape images are less than a megabyte in size, and I don't think the price is too big to pay if you load the whole tape image into memory in one shot.

To store the tape image in memory, we will again use a ByteBuffer because that will allow our native code to directly access the data from the buffer.

Here is the full code to load the tape image into memory:

...
    private ByteBuffer mTape;
...
    protected void onActivityResult(int requestCode, int resultCode, Intent data){
        // See which child activity is calling us back.
        if (requestCode == 1){
            if (resultCode == RESULT_OK) {
                String curFileName = data.getStringExtra("GetFullPath");
                try {

                    RandomAccessFile file = new RandomAccessFile(curFileName, "r");
                    FileChannel inChannel = file.getChannel();
                    long fileSize = inChannel.size();
                    mTape = ByteBuffer.allocateDirect((int) fileSize);
                    inChannel.read(mTape);
                    mTape.rewind();
                    inChannel.close();
                    file.close();
                    emuInstance.attachNewTape(mTape);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

To some people the way I did file IO might look a bit weird. The reason I did it this way is that FileInputStream doesn't support reading file contents directly into a ByteBuffer object. So, instead I used the file I/O classes from the same package family in which we find ByteBuffer: java.nio. NIO, by the way, stands for New IO.

In the line emuInstance.attachNewTape() we actually pass our ByteBuffer to our native code. We will cover this in the next section.

Going Native

Now it is time to implement the native part of the code for the Tape emulation.

The Tape Emulation functionality can be thought of as a glorified timer, triggering interrupts at rates specified within the Tape image file.

The bulk of the tape emulation functionality will be specified within a new file, tape.c. We start off by defining a function within this file for returning a timer_struct:

struct timer_struct getTapeInstance() {
  struct timer_struct mytape;
  mytape.expiredevent = &tape_pulse_expired;
  mytape.remainingCycles = 0x0;
  mytape.started = 0;
  mytape.interrupt = &interrupt_flag;
  return mytape;
}

We will discuss the implementation of the methods tape_pulse_expired and interrupt_flag in a moment.

Like with our CIA timers, we will be instantiating a tape instance within memory.c:

...
struct timer_struct tape_timer;
...
void
Java_com_johan_emulator_engine_Emu6502_memoryInit(JNIEnv* pEnv, jobject pObj)
{
  timerA = getTimerInstanceA();
  add_timer_to_list(&timerA);
  timerB = getTimerInstanceB();
  add_timer_to_list(&timerB);
  tape_timer = getTapeInstance();
  add_timer_to_list(&tape_timer);
}
...

We create a tape instance, store it as a global instance in memory.c and we add it to a list of timers to be processed at regular intervals.

Let us next implement the functionality for storing the tape image pushed through by our Java code. There is a bit of a snag here. One of the key things that should happen when we store the tape image is to initialise the remainingCycles member of tape_timer with the first value of the tape image.

We can therefore not blindly send the buffer to tape.c since tape.c doesn't have a handle on tape_timer. We need to send the buffer via memory.c to tape.c.

So, within memory.c, add the following method:

void Java_com_johan_emulator_engine_Emu6502_attachNewTape(JNIEnv* pEnv, jobject pObj, jobject oBuf) {
  jbyte * tape_image = (jbyte *) (*pEnv)->GetDirectBufferAddress(pEnv, oBuf);
  attachNewTape(tape_image, &tape_timer);
}


We implement attachNewTape within tape.c:

...
jbyte* tape_image;
int posInTape;
...
void attachNewTape(jbyte* buffer, struct timer_struct *tdev) {
  tape_image = buffer;
  posInTape = 0x14;
... 
}

It should be noted that real data only starts at offset 0x14 within a tape image. Hence, when we store a new tape image, we immediately set the position within the array to 0x14.
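
The 0x14 bytes we skip form the .tap header. As far as I understand the commonly documented layout, it starts with the 12-character signature C64-TAPE-RAW, followed by a version byte, three reserved bytes and a 4-byte little-endian data size at offset 0x10. Under that assumption, a defensive check before attaching a tape could look like the sketch below. The function names are hypothetical, and it is worth verifying the layout against a format description before relying on it.

```c
#include <stdint.h>
#include <string.h>

#define TAP_HEADER_SIZE 0x14

/* Returns 1 if the buffer starts with a plausible .tap header, else 0. */
static int tap_header_valid(const uint8_t *image, size_t length) {
  if (length < TAP_HEADER_SIZE)
    return 0;
  return memcmp(image, "C64-TAPE-RAW", 12) == 0;
}

/* Size of the pulse data, assumed stored little-endian at offset 0x10. */
static uint32_t tap_data_size(const uint8_t *image) {
  return (uint32_t)image[0x10] | ((uint32_t)image[0x11] << 8) |
         ((uint32_t)image[0x12] << 16) | ((uint32_t)image[0x13] << 24);
}
```

A check like this would let the emulator refuse a file that is not a tape image instead of feeding random bytes to the timer.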

At this point it may be a good idea to recap the format of a .tap file. A .tap file is basically a list of pulse widths, expressed in CPU cycles. Let us say, for instance, we derive the following numbers from a tape image file:

400
300
500

This would induce the following sequence:


  • After 400 cpu clock cycles interrupt the cpu
  • After 300 cpu clock cycles interrupt the cpu 
  • After 500 cpu clock cycles interrupt the cpu

A .tap file also applies a simple form of compression to reduce the size of the file. Each delay value is stored in units of 8 clock cycles. Without this scheme most delay values would take up 2 bytes. With the compression scheme the storage requirement shrinks to one byte for the majority of delay samples.

The .tap format does, however, make provision for longer delays. Such a sample starts with the value zero. If a sample with value zero is encountered, the actual value is contained in the next three bytes in low/high format. With such a sample you can therefore represent a value of up to 24 bits, that is, up to 16777215 clock cycles.

That is right, more than 16 million clock cycles! On a normal 1MHz 6502 CPU this would equal a pulse with a width of roughly 16 seconds. I doubt that you would ever encounter a C64 tape containing such a long pulse. Granted, if you have data tapes that pause between data blocks, you might see pulses of around a quarter of a second while the tape motor spins up. A quarter of a second would result in 250k clock cycles, which needs three bytes to represent.
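
Both encodings can be tried out in isolation with a small stand-alone decoder. The function tap_next_pulse below is a hypothetical helper written for this illustration. It works on unsigned bytes and an explicit length, unlike the emulator code, which reads straight from the attached buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the pulse at *pos and advance *pos past it.
   Returns the pulse width in CPU cycles, or -1 at the end of the data. */
static long tap_next_pulse(const uint8_t *data, size_t length, size_t *pos) {
  if (*pos >= length)
    return -1;
  uint8_t first = data[*pos];
  if (first != 0) {
    *pos += 1;
    return (long)first * 8;          /* short form: units of 8 cycles */
  }
  if (length - *pos < 4)
    return -1;                       /* truncated long-form sample */
  long cycles = (long)data[*pos + 1] | ((long)data[*pos + 2] << 8) |
                ((long)data[*pos + 3] << 16);
  *pos += 4;                         /* zero marker plus three data bytes */
  return cycles;
}
```

Feeding it the bytes 0x32, 0x00, 0x10, 0x27, 0x00 gives a pulse of 400 cycles (0x32 is 50, times 8) followed by one of 10000 cycles (0x002710 read low byte first).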

Let us now get back to some coding. With our knowledge of the .tap format, we can now make the following adjustments to tape.c:

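void update_remaining(struct timer_struct *tdev) {
  // jbyte is signed, so mask each byte to get its unsigned value (0..255)
  int temp = tape_image[posInTape] & 0xff;

  if (temp != 0) {
    // Short sample: pulse width stored in units of 8 cycles
    tdev->remainingCycles = temp << 3;
    posInTape++;
  } else {
    // Long sample: 24-bit cycle count in the next three bytes, low byte first
    tdev->remainingCycles = (tape_image[posInTape + 1] & 0xff)
        | ((tape_image[posInTape + 2] & 0xff) << 8)
        | ((tape_image[posInTape + 3] & 0xff) << 16);
    posInTape = posInTape + 4;
  }
}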

void attachNewTape(jbyte* buffer, struct timer_struct *tdev) {
  tape_image = buffer;
  posInTape = 0x14;
  update_remaining(tdev);
}

void tape_pulse_expired(struct timer_struct *tdev) {
  update_remaining(tdev);
  interrupt_flag();
}


As you can see, when we attach a new tape image, we also initialise the remainingCycles member of the tape instance.

You will also see that I have implemented the tape_pulse_expired method mentioned earlier. Within this method we advance to the next sample and raise an interrupt.

interrupt_flag is a method that we still need to implement within interrupt.c, so let us do that quickly:

void interrupt_flag() {
  interrupts_occured = interrupts_occured | 16;
}

The method name interrupt_flag might be a bit confusing, so let me try to clear up any possible confusion.

The cassette read line is connected to the FLAG interrupt pin of CIA#1. Hence the name interrupt_flag(). The interrupt pin is represented by bit 4 within the interrupts register. For that reason we are OR'ing interrupts_occured with 16 when this interrupt has occurred.
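
To put bit 4 in context, the sketch below names some of the interrupt sources of the CIA and shows the usual rule that an IRQ is only due when a pending source is also enabled in the mask. The variables with the _sketch suffix and the function irq_pending are assumptions for this example; the actual interrupt.c from the previous post may structure this differently.

```c
#define CIA_INT_TIMER_A 1    /* bit 0 */
#define CIA_INT_TIMER_B 2    /* bit 1 */
#define CIA_INT_FLAG    16   /* bit 4: cassette read line */

static int interrupts_occured_sketch = 0;  /* mirrors interrupts_occured */
static int interrupt_mask_sketch = 0;      /* hypothetical enable mask */

/* What interrupt_flag() does: remember that a FLAG interrupt happened. */
static void raise_flag_interrupt(void) {
  interrupts_occured_sketch |= CIA_INT_FLAG;
}

/* An IRQ is due only when a pending source is also enabled in the mask. */
static int irq_pending(void) {
  return (interrupts_occured_sketch & interrupt_mask_sketch) != 0;
}
```

This is why a tape pulse does nothing until the Kernal loader has enabled the FLAG interrupt: the bit is recorded, but the CPU is not interrupted.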

We have now implemented the important bits of tape emulation. Once our tape_timer struct gets into the started state, our tape emulation code should work! The catch, though, is that in its current state there is no way to get the tape_timer struct into the started state.

There are a number of things we still need to implement in order to get the tape_timer struct into the started state.

The first thing is emulating Tape Sense. What is Tape Sense? Well, basically, when you get the prompt PRESS PLAY ON TAPE, the Kernal keeps looking at bit 4 of memory location 1 to see whether you did indeed press the Play button.

There are a couple of things we need to do for the Tape Sense emulation to work. The first is to add an extra menu item to the FrontActivity menu. When the user sees the message PRESS PLAY ON TAPE, he/she can just select this menu option to simulate pressing play on tape.

We also need to implement some of the Tape sense functionality within tape.c:

...
int playDown = 0;
...
void Java_com_johan_emulator_engine_Emu6502_togglePlay() {
  playDown = !playDown;
}
...
int isPlayDownBit() {
  return (playDown != 0) ? 0 : 1;
}
...

Each time the user taps on the play menu item, togglePlay should be invoked.

memory.c will call isPlayDownBit when it needs to know the state of bit 4 of memory location 1 (i.e. the datasette button status).

The next thing we should consider for tape emulation is motor control. The Kernal is in charge of switching the cassette motor on or off. It will generally switch the motor on when it detects that the play button is down.

When the Kernal wishes to switch on the motor, it does so by setting bit 5 of memory location 1 to zero. This bit should also be the cue for tape.c to start the tape timer.

For motor control, we implement the following methods within tape.c:

void setMotorOn(struct timer_struct *tdev, int motorBit) {
  tdev -> started = (motorBit == 0) ? 1 : 0;
}

int getMotorOnBit(struct timer_struct *tdev) {
  return (tdev -> started == 1) ? 0 : 1;
}

Finally, let us do memory integration for tape sense and motor control. This will involve the following changes to memory.c:

jchar read_port_1() {
  jchar result = mainMem[1] & 0xcf;
  result = result | (getMotorOnBit(&tape_timer) << 5);
  result = result | (isPlayDownBit() << 4);
  return result;
}

void write_port_1(jchar value) {
  mainMem[1] = value;

  int motorStatus = (value & (1 << 5)) >> 5;
  setMotorOn(&tape_timer, motorStatus);
}


jchar memory_read(int address) {
  if (address == 1)
    return read_port_1();
  else if ((address >= 0xdc00) && (address < 0xdc10))
    return cia1_read(address);
  else
    return mainMem[address];
}

void memory_write(int address, jchar value) {
  if (((address >= 0xa000) && (address < 0xc000)) ||
       ((address >= 0xe000) && (address < 0x10000)))
    return;

  if (address == 1)
    write_port_1(value);
  else if ((address >= 0xdc00) && (address < 0xdc10))
    cia1_write(address, value);
  else
    mainMem[address] = value;
}
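
Because both tape bits are active low, it is easy to get the polarity wrong. The following stand-alone sketch mirrors what read_port_1 does, but takes the motor and button state as plain parameters so that it can be checked by hand. The function compose_port_1 is a made-up name for this illustration.

```c
/* Combine the stored value of location 1 with the live tape status bits. */
static int compose_port_1(int stored, int motor_started, int play_down) {
  int result = stored & 0xcf;              /* clear bits 4 and 5 */
  result |= (motor_started ? 0 : 1) << 5;  /* bit 5: 0 means motor on */
  result |= (play_down ? 0 : 1) << 4;      /* bit 4: 0 means button pressed */
  return result;
}
```

With the stored value 0x37, an idle datasette (motor off, no button pressed) reads back as 0x37, while pressing play with the motor running reads back as 0x07.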

This is it for tape emulation!

Enlarging the screen

Before we test all the code we wrote in this post, I just want to discuss something else.

As mentioned earlier on, the C64 320x200 pixel screen is quite small when displayed on a high resolution screen.

In this section I just want to show how easy it is to enlarge our screen in Android just with a few lines of code.

Firstly, we need to adjust the dimensions of our SurfaceView in the layout file of our FrontActivity:

    <com.johan.emulator.view.C64SurfaceView
        android:id="@+id/Video"
        android:layout_width="640px"
        android:layout_height="400px" />


And next, a small adjustment to the code we use to render the frames:

                    emuInstance.populateFrame();
                    mByteBuffer.rewind();
                    mBitmap.copyPixelsFromBuffer(mByteBuffer);
                    canvas.save();
                    canvas.scale(1.5f, 1.5f);
                    canvas.setDrawFilter(filter);
                    canvas.drawBitmap(mBitmap,0,0, paint);
                    canvas.restore();
                    holder.unlockCanvasAndPost(canvas);

The canvas performs the scaling by means of a transformation matrix, very much the same way OpenGL works with 3D graphics.

canvas.save saves the currently active matrix.

canvas.scale applies a transformation matrix that in effect maps the pixels of our 320x200 pixel bitmap to a larger area. When the canvas eventually does the drawing, it applies our transformation matrix.

Finally, canvas.restore reverts to the previous matrix.

Testing

It is time to test all our code changes.

Again, I used a tape image to test with.


We see that our emulator did in fact find Dan Dare. So our Tape emulation works!

We are, however, not in a position at the moment to see the flashing borders and splash screen that get shown while the game loads. We will tackle this in the next post.

In Summary

In this chapter we have implemented tape emulation.

We have also enlarged the output of our screen.

In the next post we will be jacking up our video output functionality. This is so that we can view the flashing borders and splash screen while the game loads.

Till next time!