String Basics

BBj string variables carry a $ suffix and are byte-oriented. The core operations -- length, extraction, trimming, case conversion -- each take a single function call or substring expression.

LEN() -- String Length

LEN() returns the length of a string in bytes:

a$ = "Hello World"
print len(a$)
rem Output: 11

Bytes vs. Characters

LEN() returns bytes, not Unicode characters. For strings containing only ASCII characters, bytes equals characters. For non-ASCII text (accented characters, emoji, CJK), one character may occupy multiple bytes:

a$ = "Hello"
print len(a$)
rem Output: 5 (ASCII: 1 byte per character)

If you need the character count for a Unicode string, use the BBjString object's length() method:

rem For Unicode-aware length, use BBjString::length()
s! = new BBjString("cafe")
print s!.length()

For most BBj business applications, strings are ASCII and LEN() is all you need. Be aware of the distinction when processing internationalized data.

Substring Extraction -- A$(pos,len)

BBj uses a 1-based position and length notation to extract substrings. This replaces MID$, LEFT$, and RIGHT$ from other BASIC dialects -- BBj does not use those functions.

a$ = "Hello World"

rem Extract 5 characters starting at position 1
print a$(1,5)
rem Output: Hello

rem Extract from position 7 to end of string
print a$(7)
rem Output: World

rem Extract a single character
print a$(3,1)
rem Output: l

Common patterns:

Expression        | Meaning                       | Equivalent in Java
a$(1,n)           | First n characters            | a.substring(0, n)
a$(len(a$)-n+1)   | Last n characters             | a.substring(a.length()-n)
a$(p,n)           | n characters from position p  | a.substring(p-1, p-1+n)
a$(p)             | From position p to end        | a.substring(p-1)
Caution: Positions are 1-based, not 0-based. The first character is a$(1,1), not a$(0,1).
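
As a small sketch of the "last n characters" pattern, the snippet below clamps n to the string length before extracting. It assumes that a substring reference reaching past the end of the string raises an error rather than being silently truncated; the variable names are illustrative only.

```bbj
rem Sketch: take the last n characters without running past the string
a$ = "Hello World"
n = 5

rem Clamp n so the start position never drops below 1
if n > len(a$) then n = len(a$)

print a$(len(a$)-n+1)
rem Output: World
```

With n larger than the string length, the clamp turns the expression into a$(1), which returns the whole string.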

Concatenation

The + operator concatenates strings:

first$ = "Hello"
last$ = "World"
full$ = first$ + " " + last$
print full$
rem Output: Hello World
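
The + operator expects string operands on both sides. A minimal sketch, assuming the standard STR() function is used to convert a number before joining it to text (count and label$ are hypothetical names):

```bbj
rem Sketch: convert a numeric value to a string before concatenating
count = 3
label$ = "Items: " + str(count)
print label$
rem Output: Items: 3
```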

CVS() -- String Cleanup

CVS() uses a bitmask to combine multiple string operations in a single call. Each bit enables a different transformation:

Value | Operation
1     | Strip leading spaces
2     | Strip trailing spaces
4     | Convert to uppercase
8     | Convert to lowercase
16    | Replace non-printable characters with spaces
32    | Collapse multiple consecutive spaces to one
64    | Swap commas and periods (European number formatting)

Add values together to combine operations:

b$ = "  hello world  "

rem Strip leading + trailing (1 + 2 = 3)
print cvs(b$, 3)
rem Output: "hello world"

rem Strip leading + trailing + uppercase (1 + 2 + 4 = 7)
print cvs(b$, 7)
rem Output: "HELLO WORLD"

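CVS() with value 3 (strip both sides) is the closest BBj equivalent of Java's String.trim(). The match is not exact: trim() also removes other leading and trailing whitespace and control characters, while these CVS() flags strip spaces. Value 7 (strip + uppercase) is a common pattern for case-insensitive comparisons, since it normalizes both padding and case before the test: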

rem Case-insensitive comparison
if cvs(input$, 7) = "YES" then print "Confirmed"
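
To see what each flag contributes before combining them, the sketch below applies values 1, 2, and 4 one at a time. The square brackets are added only to make the remaining spaces visible in the output.

```bbj
b$ = "  hello world  "

rem 1 = strip leading spaces only
print "[" + cvs(b$, 1) + "]"
rem Output: [hello world  ]

rem 2 = strip trailing spaces only
print "[" + cvs(b$, 2) + "]"
rem Output: [  hello world]

rem 4 = uppercase only; the padding is left untouched
print "[" + cvs(b$, 4) + "]"
rem Output: [  HELLO WORLD  ]
```

Because the second argument is an ordinary numeric expression, writing the sum out, as in cvs(b$, 1+2+4), can make the intent easier to read than a bare 7.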

String Variables and the $ Suffix

In BBj, the $ suffix indicates a string variable. This is a language requirement, not just a naming convention -- the suffix determines the variable's type:

name$ = "Alice"
greeting$ = "Hello, " + name$

Numeric variables have no suffix (or % for integers), and object references use !. See the Getting Started chapter for the full variable type system.
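
A brief sketch of the three suffix styles side by side. All variable names are hypothetical, and java.util.ArrayList is used only as a convenient example of an object that can be assigned to a ! variable.

```bbj
rem $ suffix: string variable
title$ = "Quarterly Report"

rem No suffix: numeric variable
price = 19.95

rem % suffix: integer variable
copies% = 3

rem ! suffix: object reference
items! = new java.util.ArrayList()
```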

Reading Legacy Code

See Reading Legacy Code for uppercase keywords, line numbers, MID$/LEFT$/RIGHT$, and other historical string patterns.

Further Reading