64tass v1.59 r3120 reference manual
This is the manual for 64tass, a multi-pass optimizing macro assembler for
the 65xx series of processors. Key features:
* Open source portable C with minimal dependencies
* Syntax familiar from Omicron TASS and TASM
* Supports 6502, 65C02, R65C02, W65C02, 65CE02, 65816, DTV, 65EL02, 4510
* Arbitrary-precision integers and bit strings, double precision floating
point numbers
* Character and byte strings, array arithmetic
* Handles UTF-8, UTF-16 and 8 bit RAW encoded source files, Unicode character
strings
* Supports Unicode identifiers with compatibility normalization and optional
case insensitivity
* Built-in `linker' with section support
* Various memory models, binary targets and text output formats (also Hex/
S-record)
* Assembly and label listings available for debugging or exporting
* Conditional compilation, macros, structures, unions, scopes
Contrary to what the length of this document suggests, 64tass can be used with
just basic 6502 assembly knowledge, in simple ways, like any other assembler. If
some advanced functionality is needed then this document can serve as a
reference.
This is a development version. In some rare cases features or syntax may change
in non-backwards compatible ways as a result of corrections. It's difficult to
get everything `right' the first time.
Project page: https://sourceforge.net/projects/tass64/
The page hosts the latest and older versions with sources and a bug and a
feature request tracker.
-------------------------------------------------------------------------------
Table of Contents
* Table of Contents
* Usage tips
* Expressions and data types
+ Integer constants
+ Bit string constants
+ Floating point constants
+ Character string constants
+ Byte string constants
+ Lists and tuples
+ Dictionaries
+ Code
+ Addressing modes
+ Uninitialized memory
+ Booleans
+ Types
+ Symbols
o Regular symbols
o Local symbols
o Anonymous symbols
o Constant and re-definable symbols
o The star label
+ Built-in functions
o Mathematical functions
o Byte string functions
o Other functions
+ Expressions
o Operators
o Comparison operators
o Bit string extraction operators
o Conditional operators
o Address length forcing
o Compound assignment
o Slicing and indexing
* Compiler directives
+ Controlling the compile offset and program counter
+ Aligning data or code
+ Dumping data
o Storing numeric values
o Storing string values
+ Text encoding
+ Structured data
o Structure
o Union
o Combined use of structures and unions
+ Macros
o Parameter references
o Text references
+ Custom functions
+ Conditional assembly
o If, else if, else
o Switch, case, default
o Comment
+ Repetitions
+ Including files
+ Scopes
+ Sections
+ 65816 related
+ Controlling errors
+ Target
+ Misc
+ Printer control
* Pseudo instructions
+ Aliases
+ Always taken branches
+ Long branches
* Original turbo assembler compatibility
+ How to convert source code for use with 64tass
+ Differences to the original turbo ass macro on the C64
+ Labels
+ Expression evaluation
+ Macros
+ Bugs
* Command line options
+ Output options
+ Operation options
+ Diagnostic options
+ Target selection on command line
+ Symbol listing
+ Assembly listing
+ Other options
+ Command line from file
* Messages
+ Warnings
+ Errors
+ Fatal errors
* Credits
* Default translation and escape sequences
+ Raw 8-bit source
o The none encoding for raw 8-bit
o The screen encoding for raw 8-bit
+ Unicode and ASCII source
o The none encoding for Unicode
o The screen encoding for Unicode
* Opcodes
+ Standard 6502 opcodes
+ 6502 illegal opcodes
+ 65DTV02 opcodes
+ Standard 65C02 opcodes
+ R65C02 opcodes
+ W65C02 opcodes
+ W65816 opcodes
+ 65EL02 opcodes
+ 65CE02 opcodes
+ CSG 4510 opcodes
* Appendix
+ Assembler directives
+ Built-in functions
+ Built-in types
-------------------------------------------------------------------------------
Usage tips
64tass is a command line assembler; the source can be written in any text
editor. At a minimum the source filename must be given on the command line. The
`-a' command line option is highly recommended if the source is Unicode or
ASCII.
64tass -a src.asm
There are also some useful parameters which are described later.
For comfortable compiling I use a `Makefile' like this (for make):
demo.prg: source.asm macros.asm pic.drp music.bin
	64tass -C -a -B -i source.asm -o demo.tmp
	pucrunch -ffast -x 2048 demo.tmp >demo.prg
This way `demo.prg' is recreated by compiling `source.asm' whenever
`source.asm', `macros.asm', `pic.drp' or `music.bin' has changed.
Of course it's not much harder to create something similar for win32
(make.bat), however this will always compile and compress:
64tass.exe -C -a -B -i source.asm -o demo.tmp
pucrunch.exe -ffast -x 2048 demo.tmp >demo.prg
Here's a slightly more advanced Makefile example with default action as testing
in VICE, clean target for removal of temporary files and compressing using an
intermediate temporary file:
all: demo.prg
	x64 -autostartprgmode 1 -autostart-warp +truedrive +cart $<
demo.prg: demo.tmp
	pucrunch -ffast -x 2048 $< >$@
demo.tmp: source.asm macros.asm pic.drp music.bin
	64tass -C -a -B -i $< -o $@
.INTERMEDIATE: demo.tmp
.PHONY: all clean
clean:
	$(RM) demo.prg demo.tmp
It's useful to add a basic header to your source files like the one below, so
that the resulting file is directly runnable without additional compression:
* = $0801
.word (+), 2005 ;pointer, line number
.null $9e, format("%4d", start);will be sys 4096
+ .word 0 ;basic line end
* = $1000
start rts
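The bytes this header produces can be checked outside the assembler. The Python sketch below is only a model of the listing above, not 64tass output: `basic_sys_header` is a made-up helper name, and it assumes the digits of the SYS address are stored as plain ASCII codes.

```python
import struct

def basic_sys_header(load_addr: int, line_number: int, target: int) -> bytes:
    """Model of the one-line BASIC program '<line_number> SYS <target>'."""
    # $9e is the SYS token; the trailing zero ends the line (like .null)
    body = bytes([0x9E]) + ("%4d" % target).encode("ascii") + b"\x00"
    # Address of the '+' label: load address + pointer + line number + body
    next_line = load_addr + 2 + 2 + len(body)
    return (struct.pack("<HH", next_line, line_number)  # pointer, line number
            + body
            + struct.pack("<H", 0))                     # end of BASIC program

header = basic_sys_header(0x0801, 2005, 0x1000)
print(header.hex(" "))
```

With a load address of $0801 the link pointer comes out as $080b, and the 12 header bytes end well before the code at $1000.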
A frequently asked question is how to allocate memory automatically, without
hacks like *=*+1. Sure, there's .byte and friends for variables with initial
values, but what about zero page, or RAM outside of the program area? The
solution is to not use an initial value, either by using `?' or by not giving a
fill byte value to .fill.
* = $02
p1 .addr ? ;a zero page pointer
temp .fill 10 ;a 10 byte temporary area
Space allocated this way is not saved in the output as there's no data to save
at those addresses.
What about some code running on zero page for speed? It needs to be relocated,
and the length must be known to copy it there. Here's an example:
ldx #size(zpcode)-1;calculate length
- lda zpcode,x
sta wrbyte,x
dex ;install to zero page
bpl -
jsr wrbyte
rts
;code continues here but is compiled to run from $02
zpcode .logical $02
wrbyte sta $ffff ;quick byte writer at $02
inc wrbyte+1
bne +
inc wrbyte+2
+ rts
.endlogical
The assembler supports lists and tuples, which may not seem interesting at
first as they sound like something that is only useful when heavy scripting is
involved. But as normal arithmetic operations also apply to all their elements
at once, they can spare quite some typing and repetition.
Let's take a simple example of a low/high byte jump table of return addresses.
This usually involves some unnecessary copy/pasting to create a pair of tables
with constructs like >(label-1).
jumpcmd lda hibytes,x ; selected routine in X register
pha
lda lobytes,x ; push address to stack
pha
rts ; jump, rts will increase pc by one!
; Build a list of jump addresses minus 1
_ := (cmd_p, cmd_c, cmd_m, cmd_s, cmd_r, cmd_l, cmd_e)-1
lobytes .byte <_ ; low bytes of jump addresses
hibytes .byte >_ ; high bytes
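What the tuple arithmetic does to the two tables can be modelled in a few lines of Python. This is a sketch only: the routine addresses are invented placeholders for cmd_p, cmd_c and so on, and the variable names merely mirror the labels above.

```python
# Hypothetical routine addresses standing in for cmd_p, cmd_c, cmd_m
routines = (0x1000, 0x10FF, 0x1200)

# Subtracting 1 from the tuple applies to every element, like (...)-1
minus_one = tuple(addr - 1 for addr in routines)

lobytes = bytes(addr & 0xFF for addr in minus_one)         # like .byte <_
hibytes = bytes((addr >> 8) & 0xFF for addr in minus_one)  # like .byte >_

print(lobytes.hex(" "), "/", hibytes.hex(" "))
```

Note how $1000-1 borrows from the high byte, which is why the subtraction has to happen before the low/high split.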
There are some other tips below in the descriptions.
-------------------------------------------------------------------------------
Expressions and data types
Integer constants
Integer constants can be entered as decimal digits of arbitrary length. An
underscore can be used between digits as a separator for better readability of
long numbers. The following operations are accepted:
Integer operators and functions
x + y       add x to y                          2 + 2 is 4
x - y       subtract y from x                   4 - 1 is 3
x * y       multiply x with y                   2 * 3 is 6
x / y       integer divide x by y               7 / 2 is 3
x % y       integer modulo of x divided by y    5 % 2 is 1
x ** y      x raised to power of y              2 ** 4 is 16
-x          negated value                       -2 is -2
+x          unchanged                           +2 is 2
~x          -x - 1                              ~3 is -4
x | y       bitwise or                          2 | 6 is 6
x ^ y       bitwise xor                         2 ^ 6 is 4
x & y       bitwise and                         2 & 6 is 2
x << y      logical shift left                  1 << 3 is 8
x >> y      arithmetic shift right              -8 >> 3 is -1
Integers are automatically promoted to floats as necessary in expressions.
Other types can be converted to integer using the integer type int.
Integer division is a floor division (rounding down), so 7 / 4 is 1 and not
1.75. If ceiling division (rounding up) is required, it can be done by
negating both the dividend and the result. Typically it's done like 0 - -5 / 4,
which results in 2.
.byte 23 ; as unsigned
.char -23 ; as signed
; using negative integers as immediate values
ldx #-3 ; works as '#-' is signed immediate
num = -3
ldx #+num ; needs explicit '#+' for signed 8 bits
lda #((bitmap >> 10) & $0f) | ((screen >> 6) & $f0)
sta $d018
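Python's integers follow the same floor division and arithmetic shift rules, so the results above can be checked with a short sketch. The helper names are made up, and the bitmap and screen addresses are example values chosen for illustration.

```python
def floor_div(x: int, y: int) -> int:
    """Floor division, rounding towards negative infinity."""
    return x // y

def ceil_div(x: int, y: int) -> int:
    """Ceiling division by negating both the dividend and the result."""
    return 0 - (-x // y)

# The $d018 expression from the example, with assumed addresses
bitmap, screen = 0x2000, 0x0400
d018 = ((bitmap >> 10) & 0x0F) | ((screen >> 6) & 0xF0)

print(floor_div(7, 4), ceil_div(5, 4), -8 >> 3, hex(d018))
```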
Bit string constants
Bit string constants can be entered in hexadecimal form with a leading dollar
sign or in binary with a leading percent sign. An underscore can be used
between digits as a separator for better readability of long numbers. The
following operations are accepted:
Bit string operators and functions
~x          invert bits             ~%101 is ~%101
y .. x      concatenate bits        $a .. $b is $ab
y x n       repeat                  %101 x 3 is %101101101
x[n]        extract bit(s)          $a[1] is %1
x[s]        slice bits              $1234[4:8] is $3
x | y       bitwise or              ~$2 | $6 is ~$0
x ^ y       bitwise xor             ~$2 ^ $6 is ~$4
x & y       bitwise and             ~$2 & $6 is $4
x << y      bitwise shift left      $0f << 4 is $0f0
x >> y      bitwise shift right     ~$f4 >> 4 is ~$f
The length of a bit string constant is defined in bits and is calculated from
the number of digits used, including leading zeros.
Bit strings are automatically promoted to integer or floating point as
necessary in expressions. The higher bits are extended with zeros or ones as
needed.
Bit strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
Other types can be converted to bit string using the bit string type bits.
.byte $33 ; 8 bits in hexadecimal
.byte %00011111 ; 8 bits in binary
.text $1234 ; $34, $12 (little endian)
lda $01
and #~$07 ; 8 bits even after inversion
ora #$05
sta $01
lda $d015
and #~%00100000 ;clear a bit
sta $d015
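Because a bit string carries its length, inverting it stays within that length. A rough Python model, in which the width is passed explicitly since Python integers have no length; the function names are hypothetical:

```python
def invert(value: int, width: int) -> int:
    """Invert a bit string of the given width (in bits)."""
    return ~value & ((1 << width) - 1)

def concat(high: int, low: int, low_width: int) -> int:
    """Concatenate two bit strings; 'low' occupies low_width bits."""
    return (high << low_width) | low

def bit_slice(value: int, start: int, stop: int) -> int:
    """Extract bits start..stop-1, counted from the least significant bit."""
    return (value >> start) & ((1 << (stop - start)) - 1)

print(hex(invert(0x07, 8)), hex(concat(0xA, 0xB, 4)), hex(bit_slice(0x1234, 4, 8)))
```

This mirrors `and #~$07` producing the 8 bit mask $f8 and the slice $1234[4:8] yielding $3.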
Floating point constants
Floating point constants have a radix point in them and optionally an exponent.
A decimal exponent is `e' while a binary one is `p'. An underscore can be used
between digits as a separator for better readability. The following operations
can be used:
Floating point operators and functions
x + y       add x to y                  2.2 + 2.2 is 4.4
x - y       subtract y from x           4.1 - 1.1 is 3.0
x * y       multiply x with y           1.5 * 3 is 4.5
x / y       divide x by y               7.0 / 2.0 is 3.5
x % y       modulo of x divided by y    5.0 % 2.0 is 1.0
x ** y      x raised to power of y      2.0 ** -1 is 0.5
-x          negated value               -2.0 is -2.0
+x          unchanged                   +2.0 is 2.0
~x          almost -x                   ~2.1 is almost -2.1
x | y       bitwise or                  2.5 | 6.5 is 6.5
x ^ y       bitwise xor                 2.5 ^ 6.5 is 4.0
x & y       bitwise and                 2.5 & 6.5 is 2.5
x << y      logical shift left          1.0 << 3.0 is 8.0
x >> y      arithmetic shift right      -8.0 >> 4 is -0.5
As usual comparing floating point numbers for (non) equality is a bad idea due
to rounding errors.
The only predefined constant is pi.
Floating point numbers are automatically truncated to integer as necessary.
Other types can be converted to floating point by using the type float.
Fixed point conversion can be done by using the shift operators. For example, an
8.16 fixed point number can be calculated as (3.14 << 16) & $ffffff. The bitwise
operators work as if the floating point number were a fixed point one.
This is the reason for the strange definition of inversion.
.byte 3.66e1 ; 36.6, truncated to 36
.byte $1.8p4 ; 4:4 fixed point number (1.5)
.sint 12.2p8 ; 8:8 fixed point number (12.2)
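The fixed point conversions above amount to scaling by a power of two and truncating. A small Python sketch for positive values only; `to_fixed` is an illustrative name, not an assembler function.

```python
def to_fixed(value: float, frac_bits: int, total_bits: int) -> int:
    """Scale by 2**frac_bits, truncate, and keep total_bits bits."""
    scaled = int(value * (1 << frac_bits))   # same effect as value << frac_bits
    return scaled & ((1 << total_bits) - 1)

print(hex(to_fixed(3.14, 16, 24)))   # 8.16 format, like (3.14 << 16) & $ffffff
print(hex(to_fixed(1.5, 4, 8)))      # 4:4 format, like $1.8p4
print(hex(to_fixed(12.2, 8, 16)))    # 8:8 format, like 12.2p8
```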
Character string constants
Character strings are enclosed in single or double quotes and can hold any
Unicode character.
Operations like indexing or slicing are always done on the original
representation. The current encoding is only applied when it's used in
expressions as numeric constants or in context of text data directives.
Doubling the quotes inside string literals escapes them and results in a single
quote.
Character string operators and functions
y .. x      concatenate strings     "a" .. "b" is "ab"
y in x      is substring of         "b" in "abc" is true
a x n       repeat                  "ab" x 3 is "ababab"
a[i]        character from start    "abc"[1] is "b"
a[-i]       character from end      "abc"[-1] is "c"
a[:]        no change               "abc"[:] is "abc"
a[s:]       cut off start           "abc"[1:] is "bc"
a[:-s]      cut off end             "abc"[:-1] is "ab"
a[s]        reverse                 "abc"[::-1] is "cba"
Character strings are converted to integers, byte and bit strings as necessary
using the current encoding and escape rules. For example when using a sane
encoding "z"-"a" is 25.
Other types can be converted to character strings by using the type str or by
using the repr and format functions.
Character strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
mystr = "oeU" ; character string constant
.text 'it''s' ; it's
.word "ab"+1 ; conversion result is "bb" usually
.text "text"[:2] ; "te"
.text "text"[2:] ; "xt"
.text "text"[:-1] ; "tex"
.text "reverse"[::-1]; "esrever"
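The slice notation behaves like Python's, so the examples can be tried there directly. The numeric conversions below are a sketch under two assumptions: an ASCII-like encoding, and that a two character string converts to a little-endian 16 bit number, which would explain the "bb" result.

```python
word = "text"
slices = (word[:2], word[2:], word[:-1], "reverse"[::-1])
print(slices)

# "z" - "a" with an ASCII-like encoding
distance = ord("z") - ord("a")

# "ab" + 1: assumed little-endian conversion, 'a' is the low byte
as_number = int.from_bytes("ab".encode("ascii"), "little") + 1
as_text = as_number.to_bytes(2, "little").decode("ascii")
print(distance, as_text)
```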
Byte string constants
Byte strings are like character strings, but hold bytes instead of characters.
Quoted character strings prefixed by `b', `l', `n', `p', `s', `x' or `z'
characters can be used to create byte strings. The resulting byte string
contains what .text, .shiftl, .null, .ptext and .shift would create. Direct
hexadecimal entry can be done using the `x' prefix and `z' denotes a z85
encoded byte string. Spaces can be used between pairs of hexadecimal digits as
a separator for better readability.
Byte string operators and functions
y .. x      concatenate strings     x"12" .. x"34" is x"1234"
y in x      is substring of         x"34" in x"1234" is true
a x n       repeat                  x"ab" x 3 is x"ababab"
a[i]        byte from start         x"abcd12"[1] is x"cd"
a[-i]       byte from end           x"abcd"[-1] is x"cd"
a[:]        no change               x"abcd"[:] is x"abcd"
a[s:]       cut off start           x"abcdef"[1:] is x"cdef"
a[:-s]      cut off end             x"abcdef"[:-1] is x"abcd"
a[s]        reverse                 x"abcdef"[::-1] is x"efcdab"
Byte strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
Other types can be converted to byte strings by using the type bytes.
.enc "screen" ;use screen encoding
mystr = b"oeU" ;convert text to bytes, like .text
.enc "none" ;normal encoding
.text mystr ;text as originally encoded
.text s"p1" ;convert to bytes like .shift
.text l"p2" ;convert to bytes like .shiftl
.text n"p3" ;convert to bytes like .null
.text p"p4" ;convert to bytes like .ptext
Binary data may be embedded in source code by using hexadecimal byte strings.
This is more compact than using .byte followed by a lot of numbers. As expected
1 byte becomes 2 characters.
.text x"fce2" ;2 bytes: $fc and $e2 (big endian)
If readability is not a concern then the more compact z85 encoding may be used
which encodes 4 bytes into 5 characters. Data lengths not a multiple of 4 are
handled by omitting leading zeros in the last group.
.text z"FiUj*2M$hf";8 bytes: 80 40 20 10 08 04 02 01
For data lengths of multiple of 4 bytes any z85 encoder will do. Otherwise the
simplest way to encode a binary file into a z85 string is to create a source
file which reads it using the line `label = binary('filename')'. Now if the
labels are listed to a file then there will be a z85 encoded definition for
this label.
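For data whose length is a multiple of 4, a z85 encoder is short enough to write by hand. The Python sketch below assumes the standard Z85 alphabet and does not attempt the shortened last group used for other lengths; `z85_encode` is an illustrative name.

```python
Z85_ALPHABET = (
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#"
)

def z85_encode(data: bytes) -> str:
    """Encode each big-endian 4 byte group as 5 base-85 characters."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 in this sketch")
    out = []
    for i in range(0, len(data), 4):
        value = int.from_bytes(data[i:i + 4], "big")
        group = []
        for _ in range(5):
            value, digit = divmod(value, 85)
            group.append(Z85_ALPHABET[digit])
        out.append("".join(reversed(group)))   # most significant digit first
    return "".join(out)

print(z85_encode(bytes.fromhex("8040201008040201")))
```

Encoding the 8 bytes from the example above gives back the same string as in the z"..." literal.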
Lists and tuples
Lists and tuples can hold a collection of values. Lists are defined from values
separated by comma between square brackets [1, 2, 3], an empty list is [].
Tuples are similar but are enclosed in parentheses instead. An empty tuple is
(), a single element tuple is (4,) to differentiate from normal numeric
expression parentheses. When nested they function similar to an array. Both
types are immutable.
List and tuple operators and functions
y .. x      concatenate lists       [1] .. [2] is [1, 2]
y in x      is member of list       2 in [1, 2, 3] is true
a x n       repeat                  [1, 2] x 2 is [1, 2, 1, 2]
a[i]        element from start      ("1", 2)[1] is 2
a[-i]       element from end        ("1", 2, 3)[-1] is 3
a[:]        no change               (1, 2, 3)[:] is (1, 2, 3)
a[s:]       cut off start           (1, 2, 3)[1:] is (2, 3)
a[:-s]      cut off end             (1, 2.0, 3)[:-1] is (1, 2.0)
a[s]        reverse                 (1, 2, 3)[::-1] is (3, 2, 1)
*a          convert to arguments    format("%d: %s", *mylist)
... op a    left fold               ... + (1, 2, 3) is ((1+2)+3)
a op ...    right fold              (1, 2, 3) - ... is (1-(2-3))
Arithmetic operations are applied to all elements recursively, therefore
[1, 2] + 1 is [2, 3], and abs([1, -1]) is [1, 1].
Arithmetic operations between lists are applied one by one on their elements,
so [1, 2] + [3, 4] is [4, 6].
When lists form an array and columns/rows are missing the smaller array is
stretched to fill in the gaps if possible, so [[1], [2]] * [3, 4] is [[3, 4],
[6, 8]].
Lists and tuples support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
mylist = [1, 2, "whatever"]
mytuple = (cmd_e, cmd_g)
mylist = ("e", cmd_e, "g", cmd_g, "i", cmd_i)
keys .text mylist[::2] ; keys ("e", "g", "i")
call_l .byte <mylist[1::2]-1; routines (<cmd_e-1, <cmd_g-1, <cmd_i-1)
call_h .byte >mylist[1::2]-1; routines (>cmd_e-1, >cmd_g-1, >cmd_i-1)
Although list elements of variables can't be changed using indexing (at the
moment), the same effect can be achieved by combining slicing and concatenation:
lst := lst[:2] .. [4] .. lst[3:]; same as lst[2] := 4 would be
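The same slice-and-concatenate idiom works on Python lists, which makes it easy to see that only one element changes. In this sketch `replaced` is a made-up helper and the list contents are arbitrary.

```python
def replaced(seq: list, index: int, value) -> list:
    """Return a copy of seq with seq[index] replaced by value."""
    return seq[:index] + [value] + seq[index + 1:]

lst = [1, 2, 3, 5]
lst = replaced(lst, 2, 4)   # like lst[:2] .. [4] .. lst[3:]
print(lst)
```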
Folding is done on pairs of elements either forward (left) or reverse (right).
The list must contain at least one element. Here are some folding examples:
minimum = size([part1, part2, part3]) <? ...
maximum = size([part1, part2, part3]) >? ...
sum = size([part1, part2, part3]) + ...
xorall = list_of_numbers ^ ...
join = list_of_strings .. ...
allbits = sprites.(left, middle, right).bits | ...
all = [true, true, true, true] && ...
any = [false, false, false, true] || ...
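The difference between the two fold directions only shows with non-associative operators such as subtraction. A Python sketch using functools.reduce; the function names are illustrative.

```python
from functools import reduce
import operator

def fold_left(op, items):
    """((a op b) op c) ...; fails on an empty sequence."""
    return reduce(op, items)

def fold_right(op, items):
    """(a op (b op c)) ...; fails on an empty sequence."""
    return reduce(lambda acc, x: op(x, acc), reversed(items))

print(fold_left(operator.add, (1, 2, 3)))    # like ... + (1, 2, 3)
print(fold_right(operator.sub, (1, 2, 3)))   # like (1, 2, 3) - ...
print(fold_left(operator.sub, (1, 2, 3)))    # left fold gives (1-2)-3 instead
```

As in the assembler, an empty sequence is an error here because there is no starting value.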
The range(start, end, step) built-in function can be used to create lists of
integers in a range with a given step value. At least the end must be given;
the start defaults to 0 and the step to 1. This may not sound very useful, so
here are a few examples:
;Bitmask table, 8 bits from left to right
.byte %10000000 >> range(8)
;Classic 256 byte single period sinus table with values of 0-255.
.byte 128 + 127.5 * sin(range(256) * pi / 128)
;Screen row address tables
_ := $400 + range(0, 1000, 40)
scrlo .byte <_
scrhi .byte >_
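To preview the contents of these tables, the same calculations can be written out in Python. This is a sketch that assumes floating point values are truncated when stored as bytes, as described for floating point constants.

```python
import math

# Bitmask table, 8 bits from left to right
bitmask = [0b10000000 >> i for i in range(8)]

# 256 byte single period sine table, truncated to integers
sine = [int(128 + 127.5 * math.sin(i * math.pi / 128)) for i in range(256)]

# Screen row addresses split into low and high byte tables
rows = [0x400 + offset for offset in range(0, 1000, 40)]
scrlo = bytes(addr & 0xFF for addr in rows)
scrhi = bytes(addr >> 8 for addr in rows)

print(bitmask, len(rows), min(sine), max(sine))
```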
Dictionaries
Dictionaries hold key and value pairs. They are defined by collecting
key:value pairs separated by commas between braces {"key":"value", :"default
value"}.
Looking up a non-existing key is normally an error unless a default value is
given. An empty dictionary is {}. This type is immutable. There are limitations
on what may be used as a key, but the value can be anything.
Dictionary operators and functions
y .. x      combine dictionaries    {1:2, 3:4} .. {2:3, 3:1} is {1:2, 2:3, 3:1}
x[i]        value lookup            {"1":2}["1"] is 2
x.i         symbol lookup           {.ONE:1, .TWO:2}.ONE is 1
y in x      is a key                1 in {1:2} is true
; Simple lookup
.text {1:"one", 2:"two"}[2]; "two"
; 16 element "fader" table 1->15->12->11->0
.byte {1:15, 15:12, 12:11, :0}[range(16)]
; Symbol accessible values. May be useful as a function return value too.
coords = {.x: 24, .y: 50}
ldx #coords.x
ldy #coords.y
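The "fader" table relies on the default value for every key that is not listed. In Python the same lookup can be sketched with dict.get; the variable names are illustrative.

```python
# Explicit entries of the fader table; every other index uses the default 0
fade = {1: 15, 15: 12, 12: 11}

# Index with each element of range(16), like {...}[range(16)]
table = bytes(fade.get(i, 0) for i in range(16))
print(list(table))
```

Following the table from 1 gives 15, then 12, then 11, then 0, which is the fade sequence named in the comment.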
Code
Code holds the result of compilation in binary and other enclosed objects. In
an arithmetic operation it's used as the numeric address of the memory where it
starts. The compiled content remains static even if later parts of the source
overwrite the same memory area.
Indexing and slicing of code to access the compiled content might be
implemented differently in future releases. Use this feature at your own risk
for now, you might need to update your code later.
Label operators and functions
a.b         b member of a               label.locallabel
.b in a     if a has symbol b           .locallabel in label
a[i]        element from start          label[1]
a[-i]       element from end            label[-1]
a[:]        copy as tuple               label[:]
a[s:]       cut off start, as tuple     label[1:]
a[:-s]      cut off end, as tuple       label[:-1]
a[s]        reverse, as tuple           label[::-1]
mydata .word 1, 4, 3
mycode .block
local lda #0
.endblock
ldx #size(mydata) ;6 bytes (3*2)
ldx #len(mydata) ;3 elements
ldx #mycode[0] ;lda instruction, $a9
ldx #mydata[1] ;2nd element, 4
jmp mycode.local ;address of local label
Addressing modes
Addressing mode operators are used to select the addressing mode of an
instruction.
For indexing there must be no white space between the comma and the register
letter, otherwise the indexing operator is not recognized. On the other hand
put a space between the comma and a single letter symbol in a list to avoid it
being recognized as an operator.
Addressing mode operators
#           immediate
#+          signed immediate
#-          signed immediate
( )         indirect
[ ]         long indirect
,b          data bank indexed
,d          direct page indexed
,k          program bank indexed
,r          data stack pointer indexed
,s          stack pointer indexed
,x          x register indexed
,y          y register indexed
,z          z register indexed
Parentheses are used for indirection and square brackets for long indirection.
These operations are only available after instructions and functions to not
interfere with their normal use in expressions.
Several addressing mode operators can be combined together. Currently the
complexity is limited to 4 operators. This is enough to describe all addressing
modes of the supported CPUs.
Valid addressing mode operator combinations
#               immediate                                 lda #$12
#+              signed immediate                          lda #+127
#-              signed immediate                          lda #-128
#addr,#addr     move                                      mvp #5,#6
addr            direct or relative                        lda $12  lda $1234  bne $1234
bit,addr        direct page bit                           rmb 5,$12
bit,addr,addr   direct page bit relative jump             bbs 5,$12,$1234
(addr)          indirect                                  lda ($12)  jmp ($1234)
(addr),y        indirect y indexed                        lda ($12),y
(addr),z        indirect z indexed                        lda ($12),z
(addr,x)        x indexed indirect                        lda ($12,x)  jmp ($1234,x)
[addr]          long indirect                             lda [$12]  jmp [$1234]
[addr],y        long indirect y indexed                   lda [$12],y
#addr,b         data bank indexed                         lda #0,b
#addr,b,x       data bank x indexed                       lda #0,b,x
#addr,b,y       data bank y indexed                       lda #0,b,y
#addr,d         direct page indexed                       lda #0,d
#addr,d,x       direct page x indexed                     lda #0,d,x
#addr,d,y       direct page y indexed                     ldx #0,d,y
(#addr,d)       direct page indirect                      lda (#$12,d)
(#addr,d,x)     direct page x indexed indirect            lda (#$12,d,x)
(#addr,d),y     direct page indirect y indexed            lda (#$12,d),y
(#addr,d),z     direct page indirect z indexed            lda (#$12,d),z
[#addr,d]       direct page long indirect                 lda [#$12,d]
[#addr,d],y     direct page long indirect y indexed       lda [#$12,d],y
#addr,k         program bank indexed                      jsr #0,k
(#addr,k,x)     program bank x indexed indirect           jmp (#$1234,k,x)
#addr,r         data stack indexed                        lda #1,r
(#addr,r),y     data stack indexed indirect y indexed     lda (#$12,r),y
#addr,s         stack indexed                             lda #1,s
(#addr,s),y     stack indexed indirect y indexed          lda (#$12,s),y
addr,x          x indexed                                 lda $12,x
addr,y          y indexed                                 lda $12,y
Direct page, data bank, program bank indexed and long addressing modes of
instructions are intelligently chosen based on the instruction type, the
address ranges set up by .dpage, .databank and the current program counter
address. Therefore the `,d', `,b' and `,k' indexing is only used in very
special cases.
The immediate direct page indexed `#0,d' addressing mode is usable for direct
page access. The 8 bit constant is a direct offset from the start of actual
direct page. Alternatively it may be written as `0,d'.
The immediate data bank indexed `#0,b' addressing mode is usable for data bank
access. The 16 bit constant is a direct offset from the start of actual data
bank. Alternatively it may be written as `0,b'.
The immediate program bank indexed `#0,k' addressing mode is usable for program
bank jumps, branches and calls. The 16 bit constant is a direct offset from the
start of actual program bank. Alternatively it may be written as `0,k'.
The immediate stack indexed `#0,s' and data stack indexed `#0,r' accept 8 bit
constants as an offset from the start of (data) stack. These are sometimes
written without the immediate notation, but this makes it more clear what's
going on. For the same reason the move instructions are written with an
immediate addressing mode `#0,#0' as well.
The immediate (#) addressing mode expects unsigned values of byte or word size.
Therefore it only accepts constants of 1 byte (or in the range 0 to 255) or of
2 bytes (or in the range 0 to 65535).
The signed immediate (#+ and #-) addressing mode is to allow signed numbers to
be used as immediate constants. It accepts a single byte or an integer in the
range -128 to 127, or two bytes or an integer in the range -32768 to 32767.
The use of signed immediate (like #-3) is seamless, but it needs to be
explicitly written out for variables or expressions (#+variable). In case the
unsigned variant is needed but the expression starts with a negation then it
needs to be put into parentheses (#(-variable)) or else it'll change the
address mode to signed.
Normally addressing mode operators are used in expressions right after
instructions. They can also be used for defining stack variable symbols when
using a 65816, or to force a specific addressing mode.
param = #1,s ;define a stack variable
const = #1 ;immediate constant
lda #0,b ;always "absolute" lda $0000
lda param ;results in lda #$01,s
lda param+1 ;results in lda #$02,s
lda (param),y ;results in lda (#$01,s),y
ldx const ;results in ldx #$01
lda #-2 ;negative constant, $fe
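The unsigned and signed immediate ranges correspond to plain and two's complement encodings of the operand. A Python sketch of the range check; `encode_immediate` is a hypothetical helper, not part of the assembler.

```python
def encode_immediate(value: int, size: int = 1, signed: bool = False) -> bytes:
    """Encode an immediate operand; raises OverflowError when out of range."""
    return value.to_bytes(size, "little", signed=signed)

print(encode_immediate(-2, signed=True).hex())      # like lda #-2
print(encode_immediate(255).hex())                  # largest unsigned byte
print(encode_immediate(-32768, 2, signed=True).hex(" "))

try:
    encode_immediate(-3)            # unsigned '#' does not accept negatives
except OverflowError as err:
    print("rejected:", err)
```

This is the reason a negative variable needs the explicit `#+` form: without it the value is checked against the unsigned range.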
Uninitialized memory
There's a special value for uninitialized memory, represented by a
question mark. Whenever it's used to generate data it creates a `hole' where
the previous content of memory is visible.
Uninitialized memory holes without previous content are not saved unless it's
really necessary for the output format; in that case they are replaced with
zeros.
It's not just data generation statements (e.g. .byte) that can create
uninitialized memory, but .fill, .align or address manipulation as well.
* = $200 ;bytes as necessary
.word ? ;2 bytes
.fill 10 ;10 bytes
.align 64 ;bytes as necessary
Booleans
There are two predefined boolean constant variables, true and false.
Booleans are created by comparison operators (<, <=, !=, ==, >=, >), logical
operators (&&, ||, ^^, !), the membership operator (in) and the all and any
functions.
Normally in numeric expressions true is 1 and false is 0, unless the
`-Wstrict-bool' command line option was used.
Other types can be converted to boolean by using the type bool.
Boolean values of various types
bits        At least one non-zero bit
bool        When true
bytes       At least one non-zero byte
code        Address is non-zero
float       Not 0.0
int         Not zero
str         At least one non-zero byte after translation
Types
The various types mentioned earlier have predefined names. These can be used
for conversions or type checks.
Built-in type names
address     Address type
bits        Bit string type
bool        Boolean type
bytes       Byte string type
code        Code type
dict        Dictionary type
float       Floating point type
gap         Uninitialized memory type
int         Integer type
list        List type
str         Character string type
symbol      Symbol type
tuple       Tuple type
type        Type type
Bit and byte string conversions can take a second parameter to specify an
exact size. Values which fit in a shorter space will be padded, but longer
ones give an error.
bits(<expression>[, <bit count>])
Convert to the specific number of bits. If the number of bits is negative
then the value is signed.
bytes(<expression>[, <byte count>])
Convert to the specific number of bytes. If the number of bytes is negative
then the value is signed.
.cerror type(var) != str, "Not a string!"
.text str(year) ; convert to string
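The sized conversion can be pictured as padding to a fixed width with a range check. The Python sketch below is a model only: it assumes little-endian byte order and two's complement for the signed case, neither of which is stated in this section, and `to_sized_bytes` is a made-up name.

```python
def to_sized_bytes(value: int, count: int) -> bytes:
    """Model of bytes(value, count); a negative count means signed."""
    signed = count < 0
    return value.to_bytes(abs(count), "little", signed=signed)

print(to_sized_bytes(0x12, 2).hex(" "))    # short value is padded
print(to_sized_bytes(-2, -2).hex(" "))     # signed, two bytes

try:
    to_sized_bytes(0x12345, 2)             # too long for two bytes
except OverflowError as err:
    print("error:", err)
```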
Symbols
Symbols are used to reference objects. Regularly named, anonymous and local
symbols are supported. These can be constant or re-definable.
Scopes are where symbols are stored and looked up. The global scope is always
defined and it can contain any number of nested scopes.
Symbols must be uniquely named in a scope, therefore in big programs it's hard
to come up with useful and easy to type names. That's why local and anonymous
symbols exist. Grouping certain related symbols into a scope sometimes makes
sense too.
Scopes are usually created by .proc and .block directives, but there are a few
other ways. Symbols in a scope can be accessed by using the dot operator, which
is applied between the name of the scope and the symbol (e.g.
myconsts.math.pi).
Regular symbols
Regular symbol names start with a letter and contain letters, numbers and
underscores. Unicode letters are allowed if the `-a' command line option was
used. There's no restriction on the length of symbol names.
Care must be taken to not use duplicate names in the same scope when the symbol
is used as a constant as there can be only one definition for them.
Duplicate names in parent scopes are not a problem and this gives the ability
to override names defined in lower scopes. However, this can just as well lead
to mistakes if a lower scoped symbol with the same name was meant, so there's a
`-Wshadow' command line option to warn if such ambiguity exists.
Case sensitivity can be enabled with the `-C' command line option, otherwise
all symbols are matched case insensitively.
For case insensitive matching it's possible to check for consistent symbol name
use with the `-Wcase-symbol' command line option.
A regular symbol is looked up first in the current scope, then in lower scopes
until the global scope is reached.
f .block
g .block
n nop ;jump here
.endblock
.endblock
jsr f.g.n ;reference from a scope
f.x = 3 ;create x in scope f with value 3
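The lookup order and the overriding of names can be illustrated with a small
sketch. The names count and inner are made up for this example. With the
`-Wshadow' option a warning is to be expected for the inner definition.

```
count   = 10            ;defined in the global scope
inner   .block
count   = 3             ;same name, defined again inside the block
        lda #count      ;lda #3, the current scope is searched first
        .endblock
        lda #count      ;lda #10, only the global definition is visible
```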
Local symbols
Local symbols have their own scope between two regularly named code symbols and
are assigned to the code symbol above them.
Therefore they're easy to reuse without explicit scope declaration directives.
Not all regularly named symbols can be scope boundaries, only plain code
symbols with nothing or an opcode after them (no macros!). Symbols defined as
procedures, blocks, macros, functions, structures and unions are ignored. Also
symbols defined by .var, := or = don't apply, and there are a few more
exceptions, so stick to using plain code labels.
The name must start with an underscore (_), otherwise the same character
restrictions apply as for regular symbols. There's no restriction on the length
of the name.
Care must be taken not to use duplicate names in the same scope when the
symbol is used as a constant.
A local symbol is only looked up in its own scope and nowhere else.
incr inc ac
bne _skip
inc ac+1
_skip rts
decr lda ac
bne _skip
dec ac+1
_skip dec ac ;symbol reused here
jmp incr._skip ;this works too, but is not advised
Anonymous symbols
Anonymous symbols don't have a unique name and are always written as a single
plus or minus sign. They are also called forward (+) and backward (-)
references.
When referencing them `-' means the first backward, `--' means the second
backward and so on. It's the same for forward, but with `+'. In expressions it
may be necessary to put them into brackets.
ldy #4
- ldx #0
- txa
cmp #3
bcc +
adc #44
+ sta $400,x
inx
bne -
dey
bne --
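When an anonymous reference is part of a larger expression, writing it in
brackets keeps the sign from being read as an operator. The following is only
a sketch of that usage.

```
-       nop
        .word (-)       ;address of the nop above
        .word (+)+1     ;address of the rts below, plus one
+       rts
```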
Excessive nesting or long distance references create code that is hard to
read. It's also very easy to copy-paste a few lines of code with these
references into a code fragment already containing similar references. The
result is usually a long debugging session to find out what went wrong.
These references are also useful in segments, but this can create a nice trap
when segments are copied into the code with their internal references.
bne +
#somemakro ;let's hope that this segment does
+ nop ;not contain forward references...
Anonymous symbols are looked up first in the current scope, then in lower
scopes until the global scope is reached.
Anonymous labels within conditionally assembled code are counted even if the
code itself is not compiled and the label won't get defined. This ensures that
anonymous labels are always at the same "distance" independent of the
conditions in between.
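Read literally, the counting rule above means that a skipped label still takes
up a position. A minimal sketch of that reading, using a condition which is
always false:

```
        ldx #3
-       dex
        .if 0
-       nop             ;skipped, so this label is never defined
        .endif
        bne --          ;the skipped label still counts, -- is the dex line
```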
Constant and re-definable symbols
Constant symbols can be created with the equal sign. These are not
re-definable. Forward referencing of them is allowed as they retain the objects
over compilation passes.
Symbols in front of code or certain assembler directives are created as
constant symbols too. They are bound to the object following them.
Re-definable symbols can be created by the .var directive or := construct.
These are also called variables. They don't carry their content over from the
previous pass, therefore it's not possible to use them before their
definition.
If the variable already exists in the current scope it'll get updated. If an
existing variable needs to be updated in a parent scope then the ::= variable
reassign operator is able to do that.
Variables can be conditionally defined using the :?= construct. If the variable
was defined already then the original value is retained, otherwise a new one is
created with this value.
WIDTH = 40 ;a constant
lda #WIDTH ;lda #$28
variabl .var 1 ;a variable
var2 := 1 ;another variable
variabl .var variabl + 1;update it verbosely
var2 += 1 ;compound assignment (add one)
var3 :?= 5 ;assign 5 if undefined
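The ::= reassign operator is not covered by the lines above, so here is a
short sketch of it. The names total and tally are hypothetical, and the
comments follow the description given earlier rather than tested output.

```
total   .var 0          ;variable in the outer scope
tally   .block
total   ::= total + 5   ;reassigns the outer variable, nothing new in tally
        .endblock
        lda #total      ;lda #5 is expected here
```

Using := inside the block instead would create a separate variable in the
scope of tally and leave the outer one unchanged.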
The star label
The `*' symbol denotes the current program counter value. When accessed its
value is the program counter at the beginning of the line. Assigning to it
changes the program counter and the compiling offset.
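A brief sketch of both uses follows. The start address $1000 is an arbitrary
choice for this example.

```
        * = $1000       ;set the program counter
        jmp *           ;endless loop, * is $1000 at the start of this line
        .word *         ;stores $1003, the 3 byte jmp is already counted
```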
Built-in functions
Built-in functions are pre-assigned to the symbols listed below. If you reuse
these symbols in a scope for other purposes then they become inaccessible, or
can perform a different function.
Built-in functions can be assigned to symbols (e.g. sinus = sin), and the new
name can be used as the original function. They can even be passed as
parameters to functions.
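Giving a built-in function a second name could look like the sketch below.
The name rounddown is made up for this example.

```
rounddown = floor       ;a second name for the built-in floor
        .cerror rounddown(-4.8) != -5.0, "never triggers: same as floor(-4.8)"
```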
Mathematical functions
floor(<expression>)
Round down. E.g. floor(-4.8) is -5.0
round(<expression>)
Round to nearest away from zero. E.g. round(4.8) is 5.0
ceil(<expression>)
Round up. E.g. ceil(1.1) is 2.0
trunc(<expression>)
Round towards zero. E.g. trunc(-1.9) is -1
frac(<expression>)
Fractional part. E.g. frac(1.1) is 0.1
sqrt(<expression>)
Square root. E.g. sqrt(16.0) is 4.0
cbrt(<expression>)
Cube root. E.g. cbrt(27.0) is 3.0
log10(<expression>)
Common logarithm. E.g. log10(100.0) is 2.0
log(<expression>)
Natural logarithm. E.g. log(1) is 0.0
exp(<expression>)