String Operations in Assembly Language
In programming, a string is a series of data bytes or words stored in consecutive memory locations. A string can be either a byte string or a word string. Memory for strings is usually allocated sequentially. There are generally two ways to specify the length of a string: storing the length explicitly, or using a sentinel character.
A string length can be stored explicitly using the $ location counter symbol, which represents the current value of the location counter. Placed immediately after the string definition, $ points to the byte after the last character of the string variable, so subtracting the string's starting address from $ gives its length. The other way is to use a trailing sentinel character, which delimits the string instead of storing its length explicitly. The sentinel is a special character that must not appear within the string itself.
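The two conventions can be sketched in C, used here only as a compact way to model what the assembler does. The names `msg`, `MSG_LEN`, and `sentinel_length` are illustrative and not from the original text, and the NASM-style lines in the comments are assumed equivalents.

```c
#include <assert.h>
#include <stddef.h>

/* Explicit length. NASM-style equivalent (illustrative):
 *     msg  db  'Hello, world!'
 *     len  equ $ - msg        ; $ is the address just past the last byte
 */
const char msg[] = { 'H','e','l','l','o',',',' ','w','o','r','l','d','!' };
#define MSG_LEN (sizeof msg)   /* fixed at build time, like "$ - msg" */

/* Sentinel. NASM-style equivalent (illustrative):
 *     msg2 db  'Hello, world!', 0
 * The length is discovered at run time by walking to the sentinel. */
size_t sentinel_length(const char *s, char sentinel)
{
    size_t n = 0;
    while (s[n] != sentinel)   /* the sentinel itself is not counted */
        n++;
    return n;
}
```

The explicit form costs nothing at run time but needs the length to be known when the program is assembled; the sentinel form works for any string but requires a scan.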
String manipulation instructions operate on strings held in memory. Each string instruction requires a source operand, a destination operand, or both. In 32-bit segments, string instructions use the ESI and EDI registers, which point to the source and destination operands, respectively.
Strings in assembly language can be processed with the following five basic instructions:
The MOVS instruction moves one byte, word, or doubleword of data from one memory location to another.
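As a rough model of a single MOVS step, the C sketch below copies one element and then advances both index registers by the element size. The function name `movs_step` and the parameters `si`, `di`, and `width` are hypothetical stand-ins for the index registers and for the B/W/D suffix of the instruction; the sketch assumes the forward direction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of one MOVSB / MOVSW / MOVSD step in the forward direction.
 * width is 1, 2, or 4 bytes; *si and *di stand in for the source and
 * destination index registers. */
void movs_step(const unsigned char *mem_src, unsigned char *mem_dst,
               size_t *si, size_t *di, size_t width)
{
    memcpy(mem_dst + *di, mem_src + *si, width); /* memory-to-memory copy */
    *si += width;                                /* both indexes advance  */
    *di += width;                                /* by the element size   */
}
```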
The LODS instruction loads data from memory. A byte operand is loaded into the AL register, a word operand into the AX register, and a doubleword operand into the EAX register.
The STOS instruction stores data from the AL, AX, or EAX register to memory.
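LODS and STOS are often used as a pair: load an element into the accumulator, transform it, and store the result. The C sketch below models such a loop for bytes. The name `to_upper_lods_stos` and the local variable `al` are illustrative, and the assembly in the comment is an assumed NASM-style equivalent rather than code from the original text.

```c
#include <assert.h>
#include <stddef.h>

/* Models the loop below (the assembly assumes ECX is nonzero on entry):
 *     cld
 * next:
 *     lodsb               ; AL = [ESI], ESI += 1
 *     cmp  al, 'a'
 *     jb   store
 *     cmp  al, 'z'
 *     ja   store
 *     sub  al, 32         ; 'a'..'z' -> 'A'..'Z' in ASCII
 * store:
 *     stosb               ; [EDI] = AL, EDI += 1
 *     loop next           ; ECX -= 1, repeat while ECX != 0
 */
void to_upper_lods_stos(const char *src, char *dst, size_t count)
{
    size_t si = 0, di = 0;
    while (count != 0) {
        char al = src[si++];          /* lodsb */
        if (al >= 'a' && al <= 'z')
            al = (char)(al - 32);
        dst[di++] = al;               /* stosb */
        count--;
    }
}
```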
The CMPS instruction compares two data items in memory. The items can be bytes, words, or doublewords.
The SCAS instruction compares the contents of the AL, AX, or EAX register with the contents of an item in memory.
Each of the above instructions has a byte, word, and doubleword version, and each can be repeated by using a repetition prefix. String instructions use the DS:SI and ES:DI register pairs, where SI and DI hold valid offset addresses that refer to bytes stored in memory. SI is normally associated with the data segment (DS), and DI is associated with the extra segment (ES). The source operand is assumed to be at DS:SI (ESI) and the destination operand at ES:DI (EDI). The SI and DI registers are used for 16-bit addresses, while ESI and EDI are used for 32-bit addresses.
The REP prefix, when placed before a string instruction (for example, REP MOVSB), repeats that instruction based on a counter held in the CX register (ECX in 32-bit code). The processor first checks whether CX is zero. If it is not, the processor executes the instruction, decrements CX by 1, and repeats the process until CX reaches zero.
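The loop that REP builds around a string instruction can be modeled in C as follows. This is a sketch of REP MOVSB only; `rep_movsb` and its parameters are hypothetical names, with `cx` standing in for the count register and `df` for the direction flag described next.

```c
#include <assert.h>
#include <stddef.h>

/* Model of REP MOVSB over one flat memory array.
 * The count is tested before each iteration, so cx == 0 copies nothing.
 * df == 0 walks toward higher addresses, df == 1 toward lower ones. */
void rep_movsb(unsigned char *mem, size_t si, size_t di, size_t cx, int df)
{
    while (cx != 0) {
        mem[di] = mem[si];            /* movsb: copy one byte       */
        if (df) { si--; di--; }       /* DF = 1: addresses decrease */
        else    { si++; di++; }       /* DF = 0: addresses increase */
        cx--;
    }
}
```

The direction matters when source and destination overlap. In this model, shifting five bytes up by one position works when copying backward from the end, whereas a forward copy over the same overlap repeats the first byte into every position.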
The direction of the operation is determined by the direction flag (DF):
- To process the string from left to right (increasing addresses), clear the direction flag with CLD (DF = 0)
- To process the string from right to left (decreasing addresses), set the direction flag with STD (DF = 1)
The REP prefix also has the following variations:
- REP - This is the unconditional repeat. It repeats the operation until CX is zero.
- REPE - It is also called REPZ and is a conditional repeat. It repeats the operation while the zero flag indicates equal or zero (ZF = 1). It stops when the zero flag indicates not equal (ZF = 0) or when CX is zero.
- REPNE - This variation is sometimes called REPNZ and is also a conditional repeat. It repeats the operation while the zero flag indicates not equal or not zero (ZF = 0). It stops when the zero flag indicates equal (ZF = 1) or when CX is decremented to zero.
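The conditional prefixes can be modeled in C as well. The sketch below shows REPE CMPSB and REPNE SCASB in the forward direction; the function names, the `zf` output parameter, and the return convention (the count remaining when the loop stops) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Model of REPE CMPSB: compare byte pairs while they are equal and the
 * count is nonzero. Returns the remaining count; *zf is 1 if the last
 * pair compared equal, 0 otherwise. If cx is 0 on entry, *zf is left
 * unchanged because no comparison takes place. */
size_t repe_cmpsb(const unsigned char *si, const unsigned char *di,
                  size_t cx, int *zf)
{
    while (cx != 0) {
        *zf = (*si++ == *di++);   /* cmpsb sets ZF and advances both pointers */
        cx--;
        if (!*zf)                 /* REPE stops on the first mismatch */
            break;
    }
    return cx;
}

/* Model of REPNE SCASB: scan for the byte in al while it has not been
 * found and the count is nonzero. Same return convention as above. */
size_t repne_scasb(unsigned char al, const unsigned char *di,
                   size_t cx, int *zf)
{
    while (cx != 0) {
        *zf = (al == *di++);      /* scasb compares AL with the byte at DI */
        cx--;
        if (*zf)                  /* REPNE stops on the first match */
            break;
    }
    return cx;
}
```

With an initial count of n, a match reported by `repne_scasb` lies at offset n - remaining - 1 from the start of the scanned region, since the count has already been decremented for the matching byte.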
Types of Strings in Assembly Language
A constant string is a region of memory that contains ASCII characters. Each ASCII character occupies one byte, and a zero byte marks the end of the string. This means you can create strings the same way you create bytes with known values.
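To make this concrete, the C sketch below defines the same zero-terminated string twice, once as characters and once as raw byte values, and checks that the two are identical in memory. The names are illustrative, ASCII encoding is assumed, and the NASM-style lines in the comment are assumed equivalents.

```c
#include <assert.h>
#include <string.h>

/* NASM-style equivalents (illustrative):
 *     greeting   db 'Hi!', 0
 *     same_bytes db 72, 105, 33, 0
 */
const char greeting[]            = "Hi!";             /* compiler appends the 0 */
const unsigned char same_bytes[] = { 72, 105, 33, 0 };

/* Returns 1 when both definitions occupy identical bytes. */
int definitions_match(void)
{
    return sizeof greeting == sizeof same_bytes &&
           memcmp(greeting, same_bytes, sizeof greeting) == 0;
}
```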
An individual byte can be accessed from memory with the syntax BYTE [address]. Because most instructions in 32-bit code work with doublewords by default, it is recommended that you load a byte with a byte-friendly instruction such as MOVZX (move with zero-extend), which copies the byte into a larger register and fills the upper bits with zeros.
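In C terms, MOVZX corresponds to widening an unsigned byte. The sketch below also shows the sign-extending counterpart, MOVSX, purely for contrast; MOVSX is not mentioned in the original text, and the function names are hypothetical. The signed conversion assumes the usual two's complement behavior of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* movzx eax, BYTE [address]: the upper 24 bits are filled with zeros. */
uint32_t load_zero_extended(const uint8_t *address)
{
    return (uint32_t)*address;
}

/* movsx eax, BYTE [address]: the upper 24 bits copy the byte's sign bit. */
uint32_t load_sign_extended(const uint8_t *address)
{
    return (uint32_t)(int32_t)(int8_t)*address;
}
```

For character data the zero-extended form is normally the one wanted, since a byte such as 0xF0 should stay 0x000000F0 rather than become 0xFFFFFFF0.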
C functions such as puts and gets can be used to write strings to the screen and to read them from input, respectively. Each takes an argument that points to the string data to write or to the buffer to read into.