Ascend by Cluster & DMA

Ascend, an Atari VCS 2600 demo for SillyVenture 2014 by Cluster and DMA.

Atari VCS 2600, no extra RAM (128 bytes only), 32k ROM.

Code and design:  Kylearan/Cluster (andre.wichmann@gmx.de)
Music and player: KK/DMA (kk@devkk.net)

After TIM1T (http://www.pouet.net/prod.php?which=62944), where I learned
the architecture of the VCS 2600, I wanted to push the limits of the
machine a bit. There are way too few 32k demos for the VCS, and a lot is
still possible on this weird little machine.

Big thanks to KK, who not only made the music and the replay routine for
both demos, but who also let me try out his compiler framework. It evolved
some during development of this demo and sometimes drove me crazy, but
it's still a huge convenience to be able to use control structures or
write multiple instructions per line.

What follows is a write-up about the different parts of this demo, for
the technically interested.


1) Intro:

It bothered me that demos either have to use blocky playfield graphics
or narrow 48 pixel sprites to display names and titles. Instead, I use
a 48 pixel sprite as a "sliding window" moving in 8 pixel steps and
missiles as blinders to make the movement look smooth. (I cannot move
the graphics in the 48 pixel sprite itself like a scroller because the
logos are too tall to fit into RAM.)

I also move and set the playfield accordingly so I can set colors on the
logos - I cannot change the color of the sprites directly, as the missile
blinders would be colored as well and no longer be invisible.

Oh boy, big logos eat a lot of ROM!


2) Title:

Here I create the illusion of a 140 pixel wide hi-res graphic by using
a 48 pixel sprite and moving it left three pixels every two lines. What
sounded easy on paper turned out to be very complicated in practice. In
fact, it took four different consecutive loops, one of them in RAM and
one unrolled, to accomplish that.
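
The basic movement can be sketched in a few lines of Python. This is
only an illustration of the arithmetic, not the kernel itself; the
function names and the start position are made up for the example.

```python
SPRITE_WIDTH = 48    # width of the 48 pixel sprite (from the text)
SHIFT = 3            # pixels moved left per step (from the text)
LINES_PER_STEP = 2   # one step every two scanlines (from the text)


def sprite_x(start_x, line):
    """Left edge of the sprite on a given scanline (line 0 = first line)."""
    return start_x - SHIFT * (line // LINES_PER_STEP)


def covered_width(lines):
    """Horizontal extent swept by the sprite over the first `lines` lines."""
    if lines < 1:
        raise ValueError("need at least one scanline")
    steps = (lines - 1) // LINES_PER_STEP
    return SPRITE_WIDTH + SHIFT * steps
```

For instance, covered_width(63) is 48 + 3*31 = 141, which is in the
region of the 140 pixels mentioned above. The actual number of lines
used in the demo is not stated here.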

Related to that, here's a small riddle. Tell me what this loop in RAM
does and why, and you'll get a virtual bonus point! :-P

	lda #0
	sta var
loop:
	sta WSYNC
	sta HMOVE
	ldy var
var = *+1
	bne *
	dc.b $2c
	dc.b $c9
	dc.b $c9
	dc.b $c9
	dc.b $c9
	dc.b $c9
	dc.b $c9
	dc.b $2c
	dc.b $24
	dc.b $60
	[...48 pixel display routine using y as index...]
	inc var
	ldy var
	[...second line 48 pixel display routine...]
	jmp loop

All in all, a lot of ROM was wasted for the graphics and the four loops
to display it. I'm not sure it was worth spending 3K on an essentially
static title screen, but I had lots of fun doing it!


3) Rotocubes:

I was inspired by KK's very nice rotocubes in Ataventure, but what I don't
like about them is their uniform grey color. So I wanted to do my own, more
colorful version, and also tried to go for maximum possible variance in
width.

The result is a two-line kernel which reads from a "frame buffer" in RAM.
Each byte encodes one of four color palettes and the width of the box in
the line (0-56). It sets PF1, PF2, COLUPF and COLUP0 accordingly, and can
reposition P0 and M0 on the same scanline (they have to have a minimum
distance from each other). By alternating a version for P0/M0 and P1/M1
and disabling/enabling the respective objects, I can seamlessly display
different box widths even if they differ by more than HMOVE could handle,
with no black line in between.
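
One way such a byte could be packed is two bits for the palette and
six bits for the width. The bit layout below is an assumption made for
illustration; the text only says that both values share one byte.

```python
WIDTH_BITS = 6     # assumed: low six bits hold the width
WIDTH_MASK = 0x3F
MAX_WIDTH = 56     # widest box (from the text)
NUM_PALETTES = 4   # number of color palettes (from the text)


def pack_line(palette, width):
    """Pack a palette number and a box width into one byte."""
    if not 0 <= palette < NUM_PALETTES:
        raise ValueError("palette out of range")
    if not 0 <= width <= MAX_WIDTH:
        raise ValueError("width out of range")
    return (palette << WIDTH_BITS) | width


def unpack_line(byte):
    """Return (palette, width) for one frame buffer byte."""
    return byte >> WIDTH_BITS, byte & WIDTH_MASK
```

Since 56 fits into six bits, the remaining two bits are exactly enough
to select one of four palettes.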

The boxes themselves then get drawn into this "frame buffer" during
overscan and vblank.


4) Lines:

This part contains the most complicated algorithms of the whole demo.
It can display line figures, as long as no more than 4 lines cross the
same scanline (using players and missiles).

The display kernel reads virtual opcodes from RAM. Encoded there are
instructions for a Bresenham-like algorithm incorporating different
pixel sizes (NUSIZ and GRP values). When a line segment ends, the
direction changes and in some cases that object needs to be repositioned.
This happens when a line would need more width per scanline than
NUSIZ/GRP and HMOVE can handle. Also, objects can appear or disappear
at the start or end of line segments.
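
To make "width per scanline" concrete, here is a small Python sketch
that splits a line into one horizontal span per scanline using integer
arithmetic. It is not the demo's algorithm or opcode format; the
function names and the max_width parameter are hypothetical, and the
real limits depend on NUSIZ/GRP and HMOVE.

```python
def scanline_spans(x0, y0, x1, y1):
    """Return a list of (y, x_left, width), one entry per scanline.

    The line must run downwards (y1 > y0).
    """
    dx, dy = x1 - x0, y1 - y0
    if dy <= 0:
        raise ValueError("line must run downwards")
    spans = []
    for k in range(dy):
        xa = x0 + (dx * k) // dy        # x where this scanline starts
        xb = x0 + (dx * (k + 1)) // dy  # x where the next one starts
        spans.append((y0 + k, min(xa, xb), max(1, abs(xb - xa))))
    return spans


def needs_reposition(spans, max_width):
    """True if any scanline needs a span wider than max_width pixels."""
    return any(width > max_width for _, _, width in spans)
```

A steep line gives spans of width 1, while a shallow one needs wider
spans per scanline, which is where a limit like max_width would kick in.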

Because starting or ending a new line needs more time than simply
continuing an existing one, the pixels are not of constant height. If
for example a repositioning is needed, all other objects on that line
become one scanline taller. Luckily, it still looks good (IMHO).

Where it becomes complicated is how these virtual opcodes are
constructed. After point coordinates have been computed for the next
frame, they are sorted via bubble sort, relying on the assumption that
they are nearly sorted already because of similar positions last frame.
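
The idea of re-sorting an almost sorted list can be sketched as a
bubble sort with an early exit. This Python version sorts a list of
point indices by y coordinate; sorting indices rather than the points
themselves is an assumption, as are the names.

```python
def bubble_sort_by_y(order, ys):
    """Sort `order` (point indices) in place by ys[index].

    Returns the number of passes, including the final pass that
    finds nothing left to swap.
    """
    passes = 0
    while True:
        passes += 1
        swapped = False
        for i in range(len(order) - 1):
            if ys[order[i]] > ys[order[i + 1]]:
                order[i], order[i + 1] = order[i + 1], order[i]
                swapped = True
        if not swapped:
            return passes
```

If `order` is kept from the previous frame and only a few points have
changed places, the loop ends after very few passes.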

Next, a linear sweep from top to bottom is made. For each point
encountered, it has to be decided if a line starting from here re-uses
an object from a previous line ending here, or a new object needs
to be started. So I also have to store the object type used for the
end-point. Some creative data packing is needed to store all this
information efficiently in RAM...
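
A much simplified version of such a sweep might look like this in
Python. It only decides which of the four objects draws which segment
and prefers an object whose previous segment ends exactly where the
new one starts. The data layout, the names and the order of preference
are assumptions; the packed RAM format of the demo is not shown.

```python
OBJECTS = ["P0", "M0", "P1", "M1"]  # two players and two missiles


def assign_objects(segments):
    """Assign an object to each (y_top, y_bottom) segment.

    Returns the object names in the order of the input list and
    raises ValueError if more than four segments share a scanline.
    """
    order = sorted(range(len(segments)), key=lambda i: segments[i][0])
    free = list(OBJECTS)
    active = []  # (y_bottom, object) for segments still being drawn
    result = [None] * len(segments)
    for i in order:
        y_top, y_bottom = segments[i]
        # Objects whose segment ends exactly here are re-used first.
        ended_here = [obj for end, obj in active if end == y_top]
        ended_before = [obj for end, obj in active if end < y_top]
        active = [(end, obj) for end, obj in active if end > y_top]
        free = ended_here + ended_before + free
        if not free:
            raise ValueError("more than 4 lines cross a scanline")
        obj = free.pop(0)
        result[i] = obj
        active.append((y_bottom, obj))
    return result
```

For two lines that each continue into a second segment, the second
segment gets the same object as the first, so no new object has to be
started at the shared point.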

Right now, no more than about 10 line segments are possible, which
is enough for the needs of this demo.


5) Greetings:

Another attempt at providing a hi-res picture wider than 48 pixels.
The face is displayed with missiles and HMOVEs, while the hair, the
eyebrow and the hand are created with player objects and the ball.
The concave chin is done by carving with the help of playfield and
ball.

The scroller is displayed via an unrolled loop, because besides
showing a 48 pixel sprite, each scanline I also have to HMOVE M0
and M1, set COLUP0 and COLUP1 to red and back to black after the
scroller, and set COLUPF twice. Some hairy timing involved here...

The hearts are displayed using a classic skipdraw routine. Since
NUSIZ has to be set to three copies close because of the missiles,
I have to hide some hearts under playfield so that the hearts don't
always appear in groups of three.
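
For readers who don't know skipdraw: the logic is that on every
scanline the kernel either fetches the next graphics byte of the
object or draws nothing, depending on whether the line falls inside
the object. Below is a Python sketch of that logic only; 6502 versions
are written so that both paths take the same number of cycles, which
this sketch does not model. The names are made up.

```python
def skipdraw_column(gfx, top, num_lines):
    """Return the graphics byte shown on each of num_lines scanlines.

    gfx is the list of bytes of the object and top is the scanline
    of its first row. Lines outside the object get 0 (nothing drawn).
    """
    out = []
    for line in range(num_lines):
        row = line - top
        if 0 <= row < len(gfx):
            out.append(gfx[row])  # inside the object: draw its row
        else:
            out.append(0)         # outside: skip the draw
    return out
```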


6) Spinning cube:

This part is the main reason I wanted to do this demo in the first
place, and the kernel and algorithms of the lines part laid the
foundations for this. Alas, this could have been so much more! I
had planned to zoom the camera away and move the cube around a bit
and to display a second object (a rotating 3D arrow flying up), but
time ran out unfortunately. Thus, I took a shortcut, but the
foundations are there.

Each line segment needs 3 virtual opcode bytes in RAM, 4 if a
reposition is needed. Backface culling is mandatory here, as we
cannot have more than 4 lines per scanline. (I also carefully
control rotation to avoid some corner cases.)
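
Backface culling itself is usually a sign test on the projected
vertices of a face. A minimal Python sketch follows; the winding
convention (which sign counts as "visible") is an assumption, and this
is not taken from the demo's code.

```python
def face_is_visible(p0, p1, p2):
    """Sign test on three projected (x, y) vertices of a face.

    The face counts as visible if the 2D cross product of its first
    two edges is positive. Which sign means "front" depends on the
    winding order of the model and the direction of the y axis.
    """
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p0[0], p2[1] - p0[1]
    return ax * by - ay * bx > 0
```

Only edges that belong to at least one visible face need to be turned
into line segments, which keeps the number of lines per scanline down.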

In addition, some extra vars are needed for setup. For the cube,
this means one frame needs between 18 and 31 bytes of RAM, making
double buffering possible while still leaving room for point
coordinates, state, music etc.


7) Fireworks:

This is a variation of my starfield kernel from TIM1T, a two-line
kernel that can reposition M0 and M1 independently. At the same
time, it can set an individual color for each fragment, can display
the ball via skipdraw for the rocket, and switch on/off P0/P1 at
certain lines for the stars. No cycles are wasted here.

The challenge in this part was to stuff all needed information into
RAM. Each rocket fragment has a 16 bit x position, 16 bit x speed,
y position, index into parabola for y movement, color, time to live,
and two indexes used for sorting the fragments by y coordinate (one
used for partial sorting, and one for a sanitized list where no
two fragments have the same y coordinate). That's 10 bytes per
fragment! In addition, I also needed RAM for rocket state and usual
stuff like music and demo state.
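
The byte count can be checked with a short Python tally. The 16 bit
fields are from the list above; that every other field takes exactly
one byte is an assumption, chosen because it matches the stated total
of 10 bytes.

```python
FRAGMENT_FIELDS = {
    "x position": 2,       # 16 bit
    "x speed": 2,          # 16 bit
    "y position": 1,
    "parabola index": 1,
    "color": 1,
    "time to live": 1,
    "partial sort index": 1,
    "sanitized list index": 1,
}

BYTES_PER_FRAGMENT = sum(FRAGMENT_FIELDS.values())
VCS_RAM = 128  # total RAM of the machine in bytes


def ram_needed(fragments):
    """RAM in bytes if every field of every fragment were stored."""
    return fragments * BYTES_PER_FRAGMENT


def max_fragments(ram_available):
    """Number of fully stored fragments that fit into the given RAM."""
    return ram_available // BYTES_PER_FRAGMENT
```

Even with all 128 bytes available, max_fragments(128) is only 12, and
the stack, music and demo state need their share as well.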

So I kept just one pseudo-random number per rocket and derived the
per-fragment random seeds from it. From those seeds, I derived as many
parameters as possible instead of storing them in RAM (time to live,
index into parabola, and more). That way, I was able to keep track of
18 fragments.
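
The principle of deriving values instead of storing them can be
sketched in Python. Everything specific here is an assumption made
for illustration: the 8 bit LFSR, its tap mask, the way a fragment
seed is derived and the value ranges are not taken from the demo.

```python
def lfsr8(state):
    """One step of an 8 bit Galois LFSR (tap mask chosen for the example).

    A state of 0 stays 0, so seeds must be non-zero.
    """
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB8
    return state


def fragment_params(rocket_seed, fragment_index):
    """Recompute (time_to_live, parabola_index) for one fragment.

    Only rocket_seed has to live in RAM; the rest is derived again
    whenever it is needed.
    """
    seed = rocket_seed
    for _ in range(fragment_index + 1):
        seed = lfsr8(seed)            # hypothetical per-fragment seed
    r = lfsr8(seed)
    time_to_live = 16 + (r & 0x0F)    # hypothetical range 16..31
    parabola_index = (r >> 4) & 0x07  # hypothetical range 0..7
    return time_to_live, parabola_index
```

Because the result depends only on the rocket seed and the fragment
number, calling fragment_params again later yields the same values
without a single byte stored per fragment.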

I love firework effects and thought this to be a nice ending part.