
MPL write_ups

The document outlines three assembly language programming experiments focusing on accepting hexadecimal numbers, calculating string lengths, and finding the largest number among various data types. It provides a comprehensive introduction to assembly language, its advantages, and the necessary tools like NASM for development. Each experiment includes problem statements, theoretical background, algorithms, and steps for implementation.

Uploaded by

hetavimodi2005
Copyright
© All Rights Reserved

Experiment 1

Title Accept numbers, store and display

Problem statement Write an X86/64 ALP to accept five 64-bit hexadecimal numbers from the
user, store them in an array, and display the accepted numbers.

Theory :

Introduction to Assembly Language Programming:

What is Assembly Language?


A processor understands only machine language instructions, which are combinations of 1s
and 0s. Machine language is too cumbersome to use directly for software development, so each
family of processors has a low-level assembly language that represents its instructions
with symbolic codes (called mnemonics), which are easier to read and write.
Assembly language is therefore processor dependent: each mnemonic stands for the machine
code of an instruction of one particular processor. This makes it easier for a programmer
to write a program than writing in machine code consisting only of 0s and 1s.
An assembler is software that converts a program written in assembly language into the machine
code (instructions) of a particular CPU.

High-level language | Low-level language
--------------------|-------------------
Closer to English, so easier to learn and understand | Less close to English, so harder to learn and understand
Requires more memory | Consumes less memory
Machine-independent; portable from one device to another | Machine-dependent; not portable to a device with a different processor or design
Very easy to debug | Difficult to debug
Allows a higher level of abstraction | Allows very little or no abstraction
Knowledge of hardware is not required to write programs | Knowledge of hardware is a prerequisite for writing programs
A single statement may be translated into a group of machine instructions | One statement is converted to one machine instruction
Examples: C, C++, Java, BASIC, Python, etc. | Examples: one assembly language for each processor

Advantages of Assembly Language


Understanding assembly language makes one aware of:
• How programs interface with the OS, the processor, and the BIOS;
• How data is represented in memory and on external devices;
• How the processor accesses and executes instructions;
• How instructions access and process data;
• How a program accesses external devices.
Other advantages of using assembly language are:
• It requires less memory and execution time;
• It makes hardware-specific jobs easier to carry out;
• It is suitable for time-critical jobs;
• It is well suited to writing interrupt service routines, other memory-resident programs,
and system software.

There are many assembler programs, such as:

• Microsoft Assembler (MASM)
• Borland Turbo Assembler (TASM)
• The GNU assembler (GAS)

We will use NASM (the Netwide Assembler), because:

• It is free.
• A lot of information about it is available on the net.
• It can be used on both Linux and Windows.

Installing NASM

If you select "Development Tools" while installing Linux, NASM may be installed along
with the operating system, so you do not need to download and install it separately. To
check whether NASM is already installed, take the following steps:
• Open a Linux terminal.
• Type the command whereis nasm
• If NASM is already installed, a line like nasm: /usr/bin/nasm is displayed.
If you see just nasm:, you need to install NASM.
To install NASM, use either of the following methods:
• From source: check the Netwide Assembler (NASM) website for the latest version and
download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the
NASM version number in the archive.
• From the package manager (Debian/Ubuntu): run the command sudo aptitude -y install nasm
Or

sudo apt-get install nasm

This should install NASM on your system.


Steps to write and execute an ASM program
Make sure the directories containing the nasm and ld binaries are in your PATH
environment variable.
1. Open a terminal window showing the command prompt.
2. Give the following command at the prompt to invoke the editor:
gedit hello.asm
3. Type the program in the gedit window, save it, and exit.
4. Make sure that you are in the directory where you saved hello.asm.
5. To assemble the program, type one of the following commands at the prompt and
press the Enter key:
nasm -f elf32 hello.asm -o hello.o (for 32 bit)
nasm -f elf64 hello.asm -o hello.o (for 64 bit)
6. If assembly is error free, the object file hello.o has been created.
7. To link and create the executable, give the command

ld -o hello hello.o (for 64 bit)
Or
ld -m elf_i386 -s -o hello hello.o (for 32 bit)
8. To execute the program, type at the prompt
./hello
9. "hello world" will be displayed at the prompt.
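
The steps above refer to a file hello.asm without listing it. A minimal sketch of such a file is given below, assuming 64-bit Linux (assemble with -f elf64 and link with plain ld). The labels msg and msglen are illustrative names, not taken from the original write-up.

```nasm
; hello.asm - minimal 64-bit Linux sketch (assumes x86-64 Linux and NASM)
section .data
    msg     db 'hello world', 10    ; the text followed by a newline (ASCII 10)
    msglen  equ $ - msg             ; length of the text, computed by the assembler

section .text
    global _start
_start:
    mov rax, 1          ; system call number 1 = write
    mov rdi, 1          ; file descriptor 1 = standard output (screen)
    mov rsi, msg        ; address of the text
    mov rdx, msglen     ; number of bytes to write
    syscall

    mov rax, 60         ; system call number 60 = exit
    xor rdi, rdi        ; exit status 0 (no error)
    syscall
```

A 32-bit version would use the EAX/EBX/ECX/EDX registers and INT 0x80 with different function numbers, so this sketch should not be assembled with -f elf32 as it stands.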

elf32 and elf64

Executable and Linkable Format (ELF) is a common standard file format for executable files and
object code. elf32 and elf64 are the 32-bit and 64-bit variants of this format.
An ELF file is divided into two parts: 1) the ELF header and 2) the file data.
The file data is in turn made up of the program header table, the section header table, and the data.
The ELF header is always present in an ELF file. The section header table is important
at link time, when the executable is created. The program header table is used at run time to
help load the executable into memory.

The assembly program structure –

Assembly language programs consist of three types of statements:

1. Instructions written in mnemonic form
2. Assembler directives or pseudo-ops
3. Macros

The instructions tell the processor what to do. Each instruction consists of an operation
code (opcode), usually followed by operands. Each instruction is converted to one machine
language instruction by the assembler.

The assembler directives or pseudo-ops are not executed by the processor. They are
commands to the assembler software, i.e. additional information that helps the assembler while
assembling, so these directives are not converted to machine language instructions.

Macros replace a block of repetitive instructions with a single statement. Macros therefore
save the time of typing the same block of instructions again and again.

An assembly program can be divided into three sections:

The data section
The bss section
The text section

• The .data section

The data section is used for declaring initialized data or constants. You can declare various
constant values, file names, buffer sizes, etc. here.
You can also define constants using the EQU directive.
Here you can use the DB, DW, DD, DQ and DT directives.
For example:
section .data
message db 'Hello world!' ; the text, one byte per character
msglength equ $ - message ; length of message in bytes ($ is the current address)
buffersize dw 1024 ; a 16-bit word holding the value 1024

• The .bss section (block started by symbol)

This section holds statically allocated variables that are declared but have not been assigned
a value yet. Their contents can be changed at runtime.
You use the RESB, RESW, RESD, RESQ and REST directives to reserve space in memory for
your uninitialized variables, like this:
section .bss
filename resb 255 ; Reserve 255 bytes
number resb 1 ; Reserve 1 byte
bignum resw 1 ; Reserve 1 word (1 word = 2 bytes)
realarray resq 10 ; Reserve an array of 10 quad words

• The .text section

This is where the actual assembly code is written. The .text section must contain the
declaration global _start, which tells the linker where program execution begins.
E.g.:
section .text
global _start
_start:
; program execution begins here
...

The global directive is specific to the NASM assembler (some other assemblers, such as MASM,
use the PUBLIC directive). It exports a symbol from your code into the generated object code.
Here the _start symbol is marked as global, so its name is added to the object code (.o). The
linker (ld) reads that symbol and its value from the object code, so it knows what to mark as
the entry point in the output executable. When you run the executable, it starts at the
location marked _start in the code.

Algorithm
(Assumption: the user enters only the characters 0 to 9 and A to F, because the program does
not check the validity of the digits in the entered numbers.)

1. Reserve 5 memory locations of 64 bits each (an array of 5 * 64 bits)
2. Set a counter to 5
3. Set a pointer to the start of the array
4. Accept a 64-bit number (i.e. 16 hex digits) and store it in the array
5. Increment the pointer to the next array element (by 8 bytes)
6. Decrement the counter
7. If the counter is not 0, go to 4
8. Set the pointer to the start of the array again
9. Set the counter to 5
10. Display a 64-bit number (i.e. 16 hex digits)
11. Increment the pointer to the next array element (by 8 bytes)
12. Decrement the counter
13. If the counter is not 0, go to 10
14. Exit the program
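
The algorithm above could be sketched in NASM roughly as follows. This is only an illustration under stated assumptions: 64-bit Linux, each input line is exactly 16 hex digits (0-9 and uppercase A-F) followed by Enter, and no validity checking. The labels array, buf, accept, to_bin, display and to_ascii are made-up names.

```nasm
; Sketch: accept five 64-bit hex numbers, store them, then display them.
section .bss
    array   resq 5              ; five 64-bit numbers
    buf     resb 17             ; 16 hex digits + Enter (newline)

section .text
    global _start
_start:
    mov rbx, array              ; pointer to the start of the array
    mov r12, 5                  ; counter
accept:
    mov rax, 0                  ; read
    mov rdi, 0                  ; from the keyboard (stdin)
    mov rsi, buf
    mov rdx, 17
    syscall
    xor rax, rax                ; the number is built up in rax
    mov rsi, buf
    mov rcx, 16                 ; 16 digits per number
to_bin:
    shl rax, 4                  ; make room for the next hex digit
    movzx rdx, byte [rsi]
    sub dl, '0'                 ; '0'..'9' -> 0..9, 'A'..'F' -> 17..22
    cmp dl, 9
    jbe digit
    sub dl, 7                   ; 17..22 -> 10..15
digit:
    or al, dl                   ; place the digit in the low 4 bits
    inc rsi
    dec rcx
    jnz to_bin
    mov [rbx], rax              ; store the number in the array
    add rbx, 8                  ; next 64-bit element
    dec r12
    jnz accept

    mov rbx, array              ; pointer back to the start of the array
    mov r12, 5
display:
    mov rax, [rbx]
    mov rdi, buf
    mov rcx, 16
to_ascii:
    rol rax, 4                  ; bring the highest digit into the low 4 bits
    mov dl, al
    and dl, 0Fh
    add dl, '0'                 ; 0..9 -> '0'..'9'
    cmp dl, '9'
    jbe store
    add dl, 7                   ; 10..15 -> 'A'..'F'
store:
    mov [rdi], dl
    inc rdi
    dec rcx
    jnz to_ascii
    mov byte [rdi], 10          ; newline after the 16 digits
    mov rax, 1                  ; write
    mov rdi, 1                  ; to the screen (stdout)
    mov rsi, buf
    mov rdx, 17
    syscall
    add rbx, 8
    dec r12
    jnz display

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall
```

The counters are kept in RBX and R12 because the syscall instruction overwrites RAX, RCX and R11.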

print of program
print of output
Experiment 2
Title Length of a string
Problem statement Write an X86/64 ALP to accept a string and to display its length

Theory :
X86/64
X86 is an Instruction Set Architecture (ISA) designed by Intel. The ISA defines
the set of instructions to be supported by the processor.
An ISA describes:
1. The types of instructions to be supported by the processor
2. The maximum length of each type of instruction
3. The format of each type of instruction

Other ISAs include the ARM ISA by Arm. AMD64, by AMD, is the 64-bit extension of X86.

X86/64 means the processor supports 64-bit data and a 64-bit OS.
Here we can use 64-bit registers with names like RAX, RBX, etc., and the instruction
used for system calls is syscall.

X86 means the processor supports 32-bit data and a 32-bit OS.
Here we can use 32-bit registers with names like EAX, EBX, etc., and the instruction
used for system calls is INT 0x80.

System calls
A system call is a method for a computer program to request a service from the kernel of
the operating system on which it is running. Kernel services can only be accessed using
system calls.
Types of system calls, based on the service to be executed by the OS kernel:
1. Process Control: create, load, execute, abort, end or terminate a process, etc.
2. File Management: create, delete, open, close, read and write files, etc.
3. Device Management: read device, write device, get device attributes, release device, etc.
4. Information Maintenance: get or set system data, get or set the time or date, etc.
5. Communication: create and delete communication connections, send and receive messages, etc.

E.g.:
1. System call for exiting a program

mov rax, 60 ; function number 60 exits (terminates) the program
mov rdi, 0 ; exit status: 0 if no error, 1 on error
syscall ; system call

2. System call for writing on the screen

mov rax, 1 ; function number 1 is for output
mov rdi, 1 ; output device 1 is the display / monitor
mov rsi, variable ; address of the variable to be displayed
mov rdx, length ; length of the variable in bytes
syscall ; system call

3. System call for reading keys from the keyboard

mov rax, 0 ; function number 0 is for input
mov rdi, 0 ; input device 0 is the keyboard
mov rsi, variable ; address of the variable where the keys read are to be stored
mov rdx, length ; length of the variable in bytes (the maximum number of keys to be read, including the Enter key)
syscall ; system call

This system call returns, in RAX, the total number of characters (keys) entered, including the Enter key.

Algorithm:
1. Reserve memory locations to store the string given by the user
2. Display a message asking the user to enter a string
3. Accept the string entered by the user
4. The read system call returns the number of characters entered (including the Enter key) in RAX; subtract 1 to get the length of the string
5. Convert the length to decimal (ASCII) format
6. Display the length
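
A possible NASM sketch of this algorithm is shown below. It assumes 64-bit Linux, an input of at most 99 characters terminated by the Enter key, and no error handling. The labels prompt, plen, inbuf, outbuf and next_digit are illustrative.

```nasm
; Sketch: read a string and display its length in decimal.
section .data
    prompt  db 'Enter a string: '
    plen    equ $ - prompt

section .bss
    inbuf   resb 100            ; the string typed by the user
    outbuf  resb 4              ; up to 3 decimal digits + newline

section .text
    global _start
_start:
    mov rax, 1                  ; write the prompt
    mov rdi, 1
    mov rsi, prompt
    mov rdx, plen
    syscall

    mov rax, 0                  ; read the string
    mov rdi, 0
    mov rsi, inbuf
    mov rdx, 100
    syscall                     ; rax = characters read, including Enter
    dec rax                     ; do not count the Enter key

    mov rdi, outbuf + 3         ; fill the output buffer from its end
    mov byte [rdi], 10          ; newline
    mov rbx, 10
next_digit:
    xor rdx, rdx
    div rbx                     ; rax = quotient, rdx = remainder (one digit)
    add dl, '0'                 ; digit -> ASCII
    dec rdi
    mov [rdi], dl
    test rax, rax
    jnz next_digit              ; repeat until the quotient is 0

    mov rsi, rdi                ; first digit
    mov rdx, outbuf + 4
    sub rdx, rsi                ; number of bytes to display
    mov rax, 1
    mov rdi, 1
    syscall

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall
```

Dividing repeatedly by 10 produces the digits from right to left, which is why the buffer is filled starting from its last byte.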

Print of Program source code

Print of Screen shot of output


Experiment 3

Title Find the largest number


Problem statement Write an X86/64 ALP to find the largest of given byte/ Word /
Dword / 64-bit numbers

Theory :
The 80386 supports 17 data types:
1. Bit: A single-bit quantity.
2. Bit Field: A group of up to 32 contiguous bits, which spans a maximum of four bytes.
3. Bit String: A set of contiguous bits; on the 80386, bit strings can be up to 4 gigabits long.
4. Signed Byte: A signed 8-bit quantity (-128 to +127)
5. Unsigned Byte: An unsigned 8-bit quantity (0 to 255)
6. Signed Integer (Word): A signed 16-bit quantity (-32768 to 32767)
7. Unsigned Integer (Word): An unsigned 16-bit quantity (0 to 65535)
8. Signed Long Integer (Double Word): A signed 32-bit quantity (-2.147 * 10^9 to 2.147 * 10^9)
9. Unsigned Long Integer (Double Word): An unsigned 32-bit quantity (0 to 4.294 * 10^9)
10. Signed Quad Word: A signed 64-bit quantity.
11. Unsigned Quad Word: An unsigned 64-bit quantity.
12. BCD: A byte contains only one decimal digit (0 to 9)
13. Packed BCD: A byte contains two decimal digits (00 to 99)
14. Offset: A 16- or 32-bit offset which references a memory location.
15. Pointer: Consists of a 16-bit segment selector and either a 16- or 32-bit offset.
16. Char: A byte contains an ASCII character.
17. String: A contiguous sequence of bytes, words or dwords. A string may contain between
1 byte and 4 Gbytes.
The Intel386 DX has 32 register resources in the following categories:
a. General Purpose Registers:
hold data before and after an instruction is executed.

In an instruction, the size of the operand (byte, word, double word) is conveyed by the name of
the register itself:
➢ EAX means a 32-bit operand
➢ AX means a 16-bit operand
➢ AL means an 8-bit operand.
The sizes of the source operand and the destination operand must be equal. The index registers
ESI and EDI are used for string (array) operations.
The pointer registers ESP and EBP are used in the stack segment.
b. Segment registers:
memory is divided into segments which are used to store different parts of a program, i.e. code (CS),
stack (SS) and data (DS, ES, FS, GS).

c. Instruction pointer: used to hold the address of the next instruction (32-bit EIP or 16-bit IP).

d. Flags register:
controls certain operations and indicates the status of the result after
arithmetic or logical operations.

e. Control Registers:
The Intel386 DX has three control registers of 32 bits, CR0, CR2 and CR3. These registers hold
machine state that applies to all tasks in the system.
f. System Address Registers:
used to access the tables or segments when the 80386 is operating in protected mode.
g. Debug Registers:
The six debug registers provide on-chip support for debugging.

h. Test Registers: used to control the testing of the Translation Lookaside Buffer (TLB).

Addressing modes
An addressing mode is the way the operands are specified in an instruction.
The control unit in the 80386 decides where to take an operand from and where to store the result
based on the addressing mode used in the instruction.
The Intel386 DX provides a total of 11 addressing modes for instructions to specify operands.
(All numeric values in the examples below are hexadecimal.)
1. Register Operand Mode:
• The operand is located in one of the 8-, 16- or 32-bit general registers.
• E.g. ADD EAX, ECX
2. Immediate Operand Mode:
The operand value is present in the instruction itself, so it is fetched along with the
instruction. No separate memory access is required to fetch the data.
E.g. ADD EAX, 500E
The value 500E is added to register EAX and the result is stored in EAX.
3. Direct Mode:
The operand's offset is contained as part of the instruction as an 8-, 16- or 32-bit displacement.
EXAMPLE: ADD EAX, [500E] ; here Offset = 500E
4. Register Indirect Mode:
A BASE register contains the address of the operand.
EXAMPLE: MOV EAX, [EDX] ; suppose EDX contains 2CA7, so Offset = 2CA7
5. Based Mode:
A BASE register's contents are added to a DISPLACEMENT to form the operand's offset.
EXAMPLE: MOV ECX, [EAX+24] ; suppose EAX contains 1000, so offset = 1024
6. Index Mode:
An INDEX register's contents are added to a DISPLACEMENT to form the operand's offset.
EXAMPLE: ADD EAX, [ESI+FD] ; suppose ESI contains 2000, so offset = 20FD
7. Scaled Index Mode:
An INDEX register's contents are multiplied by a scaling factor (which can be 1, 2, 4 or
8), and the result is added to a DISPLACEMENT to form the operand's offset.
EXAMPLE: IMUL EBX, [EDI*2]+7 ; suppose EDI contains 2000, so offset = 4007
8. Based Index Mode:
The contents of a BASE register are added to the contents of an INDEX register to form the
effective address of an operand.
EXAMPLE: MOV EAX, [ECX] [EBX] ; suppose ECX = 2000, EBX = 3000, so offset = 5000
9. Based Scaled Index Mode:
The contents of an INDEX register are multiplied by a SCALING factor and the result is added to
the contents of a BASE register to obtain the operand's offset.
EXAMPLE: MOV ECX, [EDX*2] [EBP] ; suppose EDX = 1000, EBP = 2000, so offset = 4000
10. Based Index Mode with Displacement:
The contents of an INDEX register, the contents of a BASE register and a DISPLACEMENT are
all summed together to form the operand's offset.
EXAMPLE: ADD EDX, [ESI] [EBP+00FFFFF0H] ; offset = ESI + EBP + 00FFFFF0
11. Based Scaled Index Mode with Displacement:
The contents of an INDEX register are multiplied by a SCALING factor, and the result is added to the
contents of a BASE register and a DISPLACEMENT to form the operand's offset.
EXAMPLE: MOV EAX, [EDI*4] [EBP+80]
So offset = (EDI*4) + EBP + 80
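
The examples above use the notation of the Intel data sheet. As a rough illustration in NASM syntax (where the whole address expression goes inside one pair of brackets), the sketch below uses several of these modes with 64-bit registers on Linux. The table and its values are made up for the example.

```nasm
; Sketch: a few addressing modes in NASM syntax (64-bit Linux assumed).
section .data
    table   dd 10, 20, 30, 40, 50       ; five 32-bit values (decimal)

section .text
    global _start
_start:
    mov rbx, table              ; immediate operand: the address of table (base)
    mov rsi, 2                  ; immediate operand: index = 2
    mov eax, [rbx]              ; register indirect:         table[0] = 10
    add eax, [rbx + 4]          ; based (base + disp):       table[1] = 20
    add eax, [rbx + rsi*4]      ; based scaled index:        table[2] = 30
    add eax, [rbx + rsi*4 + 4]  ; based scaled index + disp: table[3] = 40

    mov edi, eax                ; register operand: sum becomes the exit status
    mov eax, 60                 ; exit
    syscall
```

After running the program, the shell command echo $? should show 100, the sum of the four table entries that were addressed.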

Algorithm
1. Numbers are stored in contiguous memory locations (an array)
2. Set a pointer to the start of the array
3. Set the counter equal to the total count of numbers
4. Set the maximum number (max) to zero (this is valid for unsigned numbers)
5. Compare max with the number pointed to by the pointer
6. If max is less than the number, set max equal to the number
7. Increment the pointer
8. Decrement the counter
9. If the counter is not zero, go to 5, else go to 10
10. Display max as the maximum number in the array.
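
For byte-sized numbers, the algorithm could look like the following NASM sketch. It assumes 64-bit Linux and unsigned values. The sample data and the labels numbers, count, outbuf, next and keep are made up for illustration.

```nasm
; Sketch: largest of a set of unsigned bytes, displayed as two hex digits.
section .data
    numbers db 12h, 0A7h, 45h, 0F1h, 3Ch    ; sample values (made up)
    count   equ $ - numbers

section .bss
    outbuf  resb 3              ; two hex digits + newline

section .text
    global _start
_start:
    mov rsi, numbers            ; pointer to the start of the array
    mov rcx, count              ; counter
    xor al, al                  ; max = 0
next:
    cmp al, [rsi]               ; compare max with the current number
    jae keep                    ; unsigned: max >= number, keep max
    mov al, [rsi]               ; otherwise the number becomes the new max
keep:
    inc rsi                     ; next byte
    dec rcx
    jnz next

    mov dl, al
    shr dl, 4                   ; high hex digit of max
    and al, 0Fh                 ; low hex digit of max
    add dl, '0'
    cmp dl, '9'
    jbe high_done
    add dl, 7                   ; 10..15 -> 'A'..'F'
high_done:
    add al, '0'
    cmp al, '9'
    jbe low_done
    add al, 7
low_done:
    mov [outbuf], dl
    mov [outbuf + 1], al
    mov byte [outbuf + 2], 10   ; newline

    mov rax, 1                  ; write
    mov rdi, 1
    mov rsi, outbuf
    mov rdx, 3
    syscall

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall
```

With the sample data this displays F1. For words, dwords or 64-bit numbers, the same loop would use AX, EAX or RAX and advance the pointer by 2, 4 or 8. For signed numbers, jae would be replaced by jge and max would have to start from the first element rather than zero.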

Print of Program
Print of Output
Experiment No 4
Title: Count of positive and negative numbers
Problem Statement: Write an X86/64 ALP to count the number of positive and
negative numbers in an array.

Theory:
Numbers can be treated as positive only (unsigned) or as both positive and negative (signed).
A byte can be used to represent only positive numbers (unsigned). Then all 8 bits are used for
the magnitude of the number,
i.e. 0000 0000 to 1111 1111.
So unsigned numbers have a range of 0 to 255.

A byte can also be used to represent a signed number. Then the MSB indicates whether the
number is positive or negative. If it is 0, the number is positive. If it is 1, the number is
negative. The remaining 7 bits are used for the value of the number.
Negative numbers are represented in 2's complement form.
E.g.
+1 is represented as 0000 0001

-1 is represented as the 2's complement of 0000 0001, which is 1111 1111.
So -1 is 1111 1111.

Whether a number is positive or negative can be checked by checking the MSB of the
number.
Various instructions can be used to check the MSB:
1. CMP number, 0 and check the sign flag
2. ADD number, 0 and check the sign flag
3. RCL number, 1 and check the carry flag

etc.

Algorithm:
1. Numbers are stored in contiguous memory locations.( array)
2. set a pointer to the start of array
3. set counter equal to total count of numbers
4. set positive number count & negative number count to 0.
5. Check the MSB bit of a number pointed by pointer
6. if MSB =0 , increment positive number count else increment negative number count
7. Increment pointer
8. decrement counter
9. if counter is not zero , go to 5 else go to 10
10. display positive number count & negative number count
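
A NASM sketch of this algorithm for signed bytes is shown below. It assumes 64-bit Linux and fewer than ten numbers of each kind, so that each count fits in one displayed digit. The sample data and all labels are made up. Zero, having MSB = 0, would be counted as positive.

```nasm
; Sketch: count positive and negative signed bytes by testing the MSB.
section .data
    numbers db 15h, 0F2h, 7Fh, 80h, 03h, 0C4h, 5Ah  ; sample values (made up)
    count   equ $ - numbers
    result  db 'Positive: 0', 10, 'Negative: 0', 10 ; the two '0' digits are patched below
    rlen    equ $ - result

section .text
    global _start
_start:
    mov rsi, numbers            ; pointer to the start of the array
    mov rcx, count              ; counter
    xor bl, bl                  ; positive count = 0
    xor dl, dl                  ; negative count = 0
next:
    mov al, [rsi]
    shl al, 1                   ; the MSB moves into the carry flag
    jc negative
    inc bl                      ; MSB = 0: positive
    jmp advance
negative:
    inc dl                      ; MSB = 1: negative
advance:
    inc rsi
    dec rcx
    jnz next

    add bl, '0'                 ; counts -> ASCII digits
    add dl, '0'
    mov [result + 10], bl       ; digit after 'Positive: '
    mov [result + 22], dl       ; digit after 'Negative: '

    mov rax, 1                  ; write
    mov rdi, 1
    mov rsi, result
    mov rdx, rlen
    syscall

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall
```

With the sample data the program reports 4 positive and 3 negative numbers.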

print of program
print of Output
Experiment 5

Title Registers in protected mode


Statement Write an X86/64 ALP to detect protected mode and display the values of the GDTR,
LDTR, IDTR, TR and MSW registers. Also identify the CPU type using the CPUID
instruction.
Theory:
The 80386 can operate in one of the following modes, one at a time:

Feature | Real mode | Protected mode | Virtual 8086 mode
--------|-----------|----------------|------------------
Entry | The mode in which the 386 works after reset. | The 386 is switched from real mode to protected mode when the PE bit in CR0 is set to 1. | Can be activated when the 386 is already in protected mode, by setting the VM bit in the flags register.
Purpose | To set up the processor for protected mode operation. | To provide the full features of protection and paging. | To run multiple 8086 programs, but with protection, along with other 386 programs.
Behaviour | Works as an 8086. | Works with the full features of the 386. | Works as an 8086 but with some features of the 386.
Registers | Access to all registers except TR and LDTR. | Access to all registers. | Access to registers except TR, LDTR, GDTR, control registers and debug registers.
Instructions | Protected mode instructions can't be used, i.e. VERR, VERW, LAR, LSL, LTR, STR, LLDT, SLDT, ARPL. | All protected mode instructions can be used. | Protected mode instructions can't be used, i.e. VERR, VERW, LAR, LSL, LTR, STR, LLDT, SLDT, ARPL.
Physical memory | 1 MB | 4 GB | 1 MB for 8086 programs and 4 GB for 386 programs.
Logical / virtual memory | None | 64 TB | None for 8086 programs and 64 TB for 386 programs.
Segment size | Always 64 KB | Can vary from 1 byte up to 4 GB | 64 KB for 8086 programs and 4 GB for 386 programs.
Multitasking | Handles only one task at a time. | Can do multitasking. | Can do multitasking.
Paging | No paging of memory | Paging of memory | Paging of memory
Protection | No protection mechanism | Protection mechanism | Protection mechanism
Privilege levels | Only privilege level 0 (i.e. most privileged) is used. | All privilege levels (0 to 3) are used. | All Virtual 8086 mode programs execute at privilege level 3 (the level of least privilege).

Registers which are to be displayed in this program:


1) Test Registers: used for testing the Translation Lookaside Buffer (TLB)
TR6 is used to set the conditions for testing the TLB and to start testing.
TR7 returns the result (status) of TLB testing.

2)Control Registers
The 386 has three control registers of 32 bits, CR0, CR2 and CR3, These registers, hold
machine status for all tasks in the system.

CR0: Machine Control Register (contains the Machine Status Word, MSW)

CR0 contains 6 bits for control and status purposes.

The low-order 16 bits of CR0 are also known as the Machine Status Word (MSW). The LMSW and
SMSW instructions load and store the lower 16 bits of CR0.
New Intel386 DX operating systems should use the MOV CR0, Reg instruction instead to load
and store CR0.
The defined CR0 bits are described below.
PG (Paging Enable)
The PG bit is set to enable the on-chip paging unit. When it is reset, the on-chip paging
unit is disabled.
TS (Task Switched)
The processor sets this bit automatically every time it performs a task switch. It never clears
this bit on its own; you can clear it with the CLTS instruction.
EM (Emulate Coprocessor)
If this bit is set, every floating-point instruction that the 80386 fetches causes an exception
(because the 80386 itself can't execute floating-point instructions). You can use this exception
to emulate the floating-point operation in a program (instead of getting it executed by a
coprocessor), if you want.
MP (Monitor Coprocessor)
When this bit is set, the 80386 assumes that a floating-point coprocessor (80387) is attached to
it.
PE (Protection Enable)
When PE = 0, the 80386 works in real mode.
When PE is set to 1, the 80386 works in protected mode.
Changing PE from 1 to 0, to switch from protected mode back to real mode, requires a longer
sequence of instructions.
Descriptor of a segment
A descriptor includes the segment's base address, its length, its type, its privilege level, and
some miscellaneous status information.
A segment descriptor
• Describes a segment
• Must be created for every segment
• Is created by the programmer
• Stores the segment's base address
• Stores the segment's size
• Stores the segment's use / type
• Stores the segment's privilege level
It is 8 bytes (64 bits) long.

[Figure: exact format of a segment descriptor]

Descriptor tables

The descriptors of segments are kept together at one place in memory called a descriptor table.

There are three types of descriptor tables:

1. LDT
2. GDT
3. IDT

1. LDT
The descriptors of all segments of a single user task (program) are kept in a single Local
Descriptor Table (LDT). When there is multitasking, each user task has its own LDT for its
segments, so there can be many LDTs in an 80386 system.
The LDT may contain only code, data, stack, task gate, and call gate descriptors.
LDTs provide a mechanism for isolating a given task's code and data segments from the rest of
the operating system. A segment cannot be accessed by a task if its segment descriptor does not
exist in either the current LDT or the GDT. This provides protection for a task's segments, while
still allowing global data to be shared among tasks.
An LDT can hold 1 to a maximum of 8192 descriptors (and since each descriptor takes 8 bytes, the
size of an LDT is between 8 bytes and 64K bytes).
2. GDT
The descriptors of OS segments and of segments which are available to all of the tasks in a
system are kept in the Global Descriptor Table (GDT). The GDT contains descriptors for the code
and data segments used by the operating system, for task state segments, and for the LDTs in a system.
The GDT can hold 1 to a maximum of 8192 descriptors (and since each descriptor takes 8 bytes, the
size of the GDT is between 8 bytes and 64K bytes).
There can be only one GDT in an 80386 system.
3. IDT
The descriptors for all ISRs (interrupt service routines) are kept in the Interrupt Descriptor
Table (IDT).
The IDT contains descriptors of only task gates, interrupt gates, and trap gates (discussed in unit 4).
It can contain a maximum of 256 descriptors, corresponding to 256 ISRs (and since each
descriptor takes 8 bytes, the IDT needs at most 2K bytes).
There can be only one IDT in an 80386 system.

These 3 tables can be stored anywhere in memory, but the processor needs to know their
starting addresses. These addresses are held in:

GDTR, IDTR & LDTR

The starting addresses of the 3 tables (GDT, IDT, LDT) have to be loaded into the processor's
three registers GDTR, IDTR and LDTR, using the instructions LGDT, LIDT and LLDT
respectively.
Note:
GDTR and IDTR hold the 32-bit base (starting) address and the 16-bit limit of the GDT and the
IDT respectively.
LDTR, however, holds a 16-bit selector for the descriptor of the LDT, which is stored in the GDT.
From that descriptor the base address and limit of the LDT are copied into the invisible part of
the LDTR register (invisible means it can't be accessed by the programmer).

CPUID (CPU identification)

This instruction gives various details about the processor installed in the system.
Before using this instruction, EAX has to be loaded with a number; CPUID then returns the related
information in EAX, EBX, ECX and EDX, such as:
the vendor identification string, e.g. GenuineIntel, AuthenticAMD, etc.,
the processor serial number,
information about the cache, TLB, etc.
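
As a small illustration of CPUID, the sketch below displays the 12-character vendor string that the instruction returns in EBX, EDX and ECX (in that order) when EAX = 0. It assumes 64-bit Linux. The label vendor is an illustrative name.

```nasm
; Sketch: display the CPU vendor string using CPUID with EAX = 0.
section .bss
    vendor  resb 13             ; 12 characters + newline

section .text
    global _start
_start:
    xor eax, eax                ; EAX = 0 selects the vendor string
    cpuid
    mov [vendor], ebx           ; characters 1 to 4
    mov [vendor + 4], edx       ; characters 5 to 8
    mov [vendor + 8], ecx       ; characters 9 to 12
    mov byte [vendor + 12], 10  ; newline

    mov rax, 1                  ; write
    mov rdi, 1
    mov rsi, vendor
    mov rdx, 13
    syscall

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall
```

The output depends on the machine, for example GenuineIntel or AuthenticAMD.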

Algorithm
1. Read a protected mode register (using the instructions SMSW, SGDT, SIDT, SLDT, STR) or the
CPU information (using CPUID)
2. Convert each hex digit of the value to ASCII (by adding 30H if the digit is less than or equal to
9, or adding 37H if the digit is more than 9)
3. Display all digits of the register (after converting them to ASCII in step 2)
4. Repeat for all registers

print of program
print of Output

Assignment :
Analyze the values of all protected mode registers
Experiment 6

Title Non overlapped block transfer without using string instructions


Statement Write X86/64 ALP to perform non overlapped block transfer without using string
specific instructions. Block containing data can be defined in the data segment.

Theory :
The 80386 supports 17 data types:
1. Bit 2. Bit Field 3. Bit String 4. Signed Byte (8 bit) 5. Unsigned Byte (8 bit)
6. Signed Word (16 bit) 7. Unsigned Word (16 bit) 8. Signed Double Word (32 bit)
9. Unsigned Double Word (32 bit) 10. Signed Quad Word (64 bit) 11. Unsigned Quad Word (64 bit)
12. Unpacked BCD 13. Packed BCD 14. Offset 15. Pointer 16. Char
17. String: A string is a sequence of separate but related data items stored at consecutive addresses
(it can be thought of as a single-dimensional array). A string's size can be between 1 byte and 4
Gbytes. The 80386 supports byte strings, word strings, and dword strings.

Algorithm
(case 1: the source block is at a lower address and the destination block is at a higher address)

[Figure: memory contents before transfer and after transfer]

1. Initialize two blocks (strings) of 5 bytes each, such that the blocks do not overlap.
2. Display both strings before the transfer
3. Set pointer1 to the start of the source block
4. Set pointer2 to the start of the destination block
5. Set counter = 5
6. Copy a byte from the location pointed to by pointer1 to the location pointed to by pointer2
7. Increment pointer1 and pointer2 by 1 (because the data is byte size)
8. Decrement the counter
9. If the counter is not zero, go to 6, else go to 10
10. Display both strings after the transfer
11. End

Note:
For data of word size, the pointers need to be incremented by two &
For data of double word size, the pointers need to be incremented by four.
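
The byte-size case of this algorithm could be sketched in NASM as below, using ordinary MOV instructions instead of string instructions. It assumes 64-bit Linux. The block contents and the labels source, dest, blen, copy and show are made up for illustration.

```nasm
; Sketch: non-overlapped block transfer without string instructions.
section .data
    source  db 'ABCDE', 10      ; 5 data bytes + newline (for display only)
    dest    db '.....', 10      ; destination block, directly after the source
    blen    equ 5               ; number of bytes to transfer

section .text
    global _start
_start:
    call show                   ; both blocks before the transfer

    mov rsi, source             ; pointer1 = start of source block
    mov rdi, dest               ; pointer2 = start of destination block
    mov rcx, blen               ; counter
copy:
    mov al, [rsi]               ; copy one byte through AL
    mov [rdi], al
    inc rsi                     ; byte-size data: pointers move by 1
    inc rdi
    dec rcx
    jnz copy

    call show                   ; both blocks after the transfer

    mov rax, 60                 ; exit
    xor rdi, rdi
    syscall

show:                           ; display source and dest (12 bytes in all)
    mov rax, 1
    mov rdi, 1
    mov rsi, source
    mov rdx, 12
    syscall
    ret
```

The program displays ABCDE and ..... before the transfer and ABCDE twice after it. The data passes through AL because x86 has no ordinary MOV with two memory operands.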

Print of program
Print of output
Experiment 7
Title overlapped block transfer using string instructions
Statement Write X86/64 ALP to perform overlapped block transfer using string specific
instructions. Block containing data can be defined in the data segment.

Theory :

For string operation , the default registers used are


➢ The SI or ESI (source index) register
➢ The DI or EDI (destination index) register
➢ The CX (count) register
➢ The AL or AX or EAX register and
➢ The direction flag (DF) in EFLAGS register

80386 has a group of instructions , which can support operations on strings.


MOVSB/ MOVSW/ MOVSD : This instruction copies the memory operand ( byte ,or word or
Double word) specified by DS:SI (or DS:ESI) to the memory location specified by ES:DI (or
ES:EDI). After that, SI and DI (or ESI and EDI) will be incremented ( if DF=0) or decremented (
if DF=1) by 1,2, or 4 based on whether it is byte/ word/ double word.
direction flag can be cleared ( =0) by instruction : CLD
direction flag can be set (=1) by instruction STD
The amount by which ESI , EDI are incremented or decremented is decided by what is the size
of data .
If it is byte they are changed by 1 eg instruction MOVSB
If it is word they are changed by 2 eg instruction MOVSW
If it is Dword they are changed by 4 eg instruction MOVSD
INSB/ INSW/INSD : read data from input device
OUTSB/ OUTW/ OUTD : write data to output device
STOSB/ STOSW/ STOSD (store string ): This instruction will write the contents of the
accumulator (AL, AX or EAX) to the memory location specified by ES:DI (ES:EDI for 32 bit
operations). After that, SI and DI (or ESI and EDI) will be incremented or decremented ( by 1,2,
or 4 based on whether it is byte/ word/ double word), depending on the state of the direction flag.
LODSB / LODSW / LODSD ( load string) : This instruction will load the BYTE, WORD, or
DWORD at DS:SI (or DS:ESI) into the accumulator. After that, SI and DI (or ESI and EDI) will
be incremented or decremented by 1/2/4.
SCASB / SCASW / SCASD (scan string): This instruction compares the value in the
accumulator (AL, AX or EAX) with the contents of the memory location specified by ES:DI (or
ES:EDI). The flags are set according to the result of the comparison. After that, DI or EDI
is incremented or decremented by 1/2/4.
CMPSB / CMPSW / CMPSD (compare strings): This instruction subtracts the operand at
ES:DI (or ES:EDI) from the operand at DS:SI (or DS:ESI), setting the
flags and discarding the result (the same operation as the CMP instruction, except that CMPS is
for strings). After that, SI and DI (or ESI and EDI) are incremented or decremented by
1/2/4.
REP prefix
Each of the above instructions operates on only one element of a string at a time. To repeat the
same operation on all elements of a string, the REP prefix is used, with the register CX /
ECX as the counter. After each repetition CX (ECX for 32 bit operation) is
decremented, and the string instruction is repeated until CX becomes 0.
REP MOVSB copies the number of bytes given in CX/ECX from DS:SI to ES:DI. After each byte is
copied, SI & DI are incremented or decremented based on the direction flag, CX/ECX is
decremented, & if CX/ECX is not zero the next byte is copied.
A condition can also be combined with REP:
REPE / REPZ Repeat while CX (ECX) is not 0 and ZF is set
REPNE / REPNZ Repeat while CX (ECX) is not 0 and ZF is clear
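
The conditional repeat can be pictured with a small Python model. This is only a sketch of the REPNE SCASB behaviour described above with DF=0, not assembly code; the function name repne_scasb and the sample text are assumptions made for illustration.

```python
def repne_scasb(memory, di, cx, al):
    """Model REPNE SCASB with DF=0: scan at most cx bytes from index di for the value al."""
    zf = False  # assumption: ZF starts clear (the real flag keeps its old value if cx is 0)
    while cx != 0:
        zf = (memory[di] == al)  # SCASB compares AL with the byte at ES:DI, ZF=1 on a match
        di += 1                  # DF=0, so DI moves forward by 1 for a byte
        cx -= 1                  # the prefix decrements the counter after every element
        if zf:                   # REPNE stops as soon as ZF is set
            break
    return di, cx, zf


text = b"hello world"
print(repne_scasb(text, 0, len(text), ord(" ")))   # (6, 5, True)
```

When the byte is found, DI has already moved one position past the match, so the match is at DI-1.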

Algorithm
( case 1: source block is at lower address & destination block is at higher address )

[Figure: memory blocks before transfer and after transfer]

1. Initialize two blocks (strings) of 5 bytes each, such that the second block overlaps the first.
2. Display both strings before transfer.
3. Set SI/ESI to the last byte of the source block & set DI/EDI to the last byte of the destination block.
4. Set counter (CX/ECX) = 5.
5. Set the direction flag, i.e. DF=1 (STD), so that the copy runs from the end of the blocks towards
the start. This way each overlapped source byte is copied before it is overwritten.
6. Copy the bytes from the source block to the destination block using the string instruction
REP MOVSB
7. Display both strings after transfer.
8. End

Note:
For data of word size, use MOVSW; the pointers are then changed by two after each element.
For data of double word size, use MOVSD; the pointers are then changed by four after each element.
Algorithm
( case 2: source block is at higher address & destination block is at lower address )

[Figure: memory blocks before transfer and after transfer]

1. Initialize two blocks (strings) of 5 bytes each, such that the second block overlaps the first.
2. Display both strings before transfer.
3. Set SI/ESI to the first byte of the source block & set DI/EDI to the first byte of the destination block.
4. Set counter (CX/ECX) = 5.
5. Clear the direction flag, i.e. DF=0 (CLD), so that the copy runs from the start of the blocks
towards the end.
6. Copy the bytes from the source block to the destination block using the string instruction
REP MOVSB
7. Display both strings after transfer.
8. End
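
Both cases can be checked with a small Python model of REP MOVSB. It is a sketch under stated assumptions rather than the NASM program: memory is a bytearray, the pointers are indexes into it, and the names rep_movsb and overlapped_copy are made up for illustration.

```python
def rep_movsb(memory, si, di, cx, df):
    """Model REP MOVSB on a bytearray: copy cx bytes from index si to index di."""
    step = -1 if df else 1       # DF=1 (STD) moves the pointers down, DF=0 (CLD) moves them up
    while cx != 0:
        memory[di] = memory[si]  # MOVSB copies one byte
        si += step
        di += step
        cx -= 1                  # REP decrements the counter after each byte
    return memory


def overlapped_copy(memory, src, dst, count):
    """Copy count bytes from src to dst inside memory, choosing the direction by case."""
    if src < dst:
        # case 1: destination is at the higher address, start at the last byte with DF=1
        return rep_movsb(memory, src + count - 1, dst + count - 1, count, df=1)
    # case 2: destination is at the lower address, start at the first byte with DF=0
    return rep_movsb(memory, src, dst, count, df=0)


block = bytearray([0x11, 0x22, 0x33, 0x44, 0x55, 0, 0, 0])
overlapped_copy(block, 0, 3, 5)      # destination overlaps the last two source bytes
print(block.hex())                   # 1122331122334455
```

If case 1 is run with DF=0 instead, the bytes 44h and 55h are overwritten before they are copied, and the destination ends up as 11 22 33 11 22.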

Print of program
Print of output
Experiment 8
Title Multiplication
Problem statement Write 8086 ALP to perform multiplication of two 8-bit numbers. Use the
successive addition method and the shift & add method.

Theory:
The 80386 has two instructions for multiplication:
1. MUL
It is for unsigned multiplication (i.e. both numbers are treated as non-negative numbers).

Format : MUL R/M ; the source (multiplier) can be a register or a memory operand, not an immediate number

Eg MUL CL ; multiplies the 8 bit AL register by the 8 bit CL register & the result is stored in the 16
;bit AX register

2. IMUL
It is for signed multiplication (multiplication of two numbers which can be positive or negative).
Format : IMUL R/M ; in this one-operand form the source (multiplier) can be a register or a memory operand
Eg
IMUL CL ; multiplies AL * CL & the result is stored in AX
For both MUL & IMUL (one-operand form), the following rules apply:

If the source (multiplier) is a byte, it multiplies AL (multiplicand) & the result is stored in AX.
If the source (multiplier) is a word, it multiplies AX (multiplicand) & the result is stored in DX:AX.
If the source (multiplier) is a double word, it multiplies EAX (multiplicand) & the result is stored in
EDX:EAX.

Source (Multiplier)    Accumulator (Multiplicand)    Result (Product) [Double-length]
8-bit                  AL                            AX
16-bit                 AX                            DX:AX
32-bit                 EAX                           EDX:EAX
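The flags affected are OF & CF: they are set when the upper half of the result is significant (for MUL, when it
is non-zero) and cleared otherwise. All other flags (SF, ZF, AF, PF) are undefined.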
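
The double-length result can be illustrated with a short Python model of unsigned MUL. This is a sketch for checking the arithmetic, not 80386 code; the function name mul and its return layout are assumptions made for illustration.

```python
def mul(accumulator, source, bits):
    """Model unsigned MUL for an operand size of 8, 16 or 32 bits.

    Returns (high_half, low_half, cf_of). For bits=8 the halves are AH and AL (together AX),
    for bits=16 they are DX and AX, for bits=32 they are EDX and EAX.
    """
    mask = (1 << bits) - 1
    product = (accumulator & mask) * (source & mask)   # up to 2*bits wide
    high, low = product >> bits, product & mask
    return high, low, high != 0    # CF and OF are set when the upper half is non-zero


print(mul(0x0D, 0x06, 8))        # (0, 78, False): 13 * 6 fits in AL
print(mul(0xFF, 0xFF, 8))        # (254, 1, True): AX = FE01h
print(mul(0x1234, 0x0100, 16))   # (18, 13312, True): DX = 0012h, AX = 3400h
```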

Multiplication can also be done by two methods that do not use the MUL or IMUL instruction.

Assumption : both numbers are positive (unsigned).

The following methods, as given, do not work for signed multiplication; for signed numbers an
algorithm such as Booth's algorithm is used.
1. Successive addition method

A*B means adding A, B times, starting from 0,
i.e. the multiplicand is added to a running total as many times as the value of the multiplier.

Eg 1101 * 0110 = 1101 + 1101 + 1101 + 1101 + 1101 + 1101 = 1001110
i.e. (13) * (6) = 13 + 13 + 13 + 13 + 13 + 13 = (78)

Algorithm for Successive addition method

1. BL holds multiplier = 0000 0110 (assumed to be non-zero; if the multiplier is 0 the result is 0
and steps 4 to 6 are skipped)
2. RCX is loaded with 0. RCX is used to store partial products as well as the final result.
3. RAX holds multiplicand = 0000 1101
4. RAX is added to RCX & the result is stored in RCX, i.e. RCX + RAX => RCX
5. Decrement BL
6. If BL is not zero then go to 4 else go to 7
7. Display the result which is in RCX after converting it to ASCII
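
The steps above can be followed in a short Python model. It is a sketch and not the NASM procedure: the variable names mirror the registers, and the loop tests BL before the first addition (an assumption that differs slightly from steps 4 to 6) so that a multiplier of 0 gives 0.

```python
def succ_add(multiplicand, multiplier):
    """Multiply two unsigned 8 bit numbers by successive addition."""
    rax = multiplicand & 0xFF   # step 3: multiplicand
    bl = multiplier & 0xFF      # step 1: multiplier, used as the loop counter
    rcx = 0                     # step 2: running total and final result
    while bl != 0:              # step 6, tested first so that multiplier 0 is handled
        rcx += rax              # step 4: RCX + RAX => RCX
        bl -= 1                 # step 5
    return rcx


product = succ_add(0b1101, 0b0110)
print(product, format(product, "b"))   # 78 1001110
```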

2. Shift/Add Method

In the paper-and-pencil method of binary multiplication:
• Work out the partial product for each bit of the multiplier (if the multiplier bit is 0, the partial
product is 0000 & if the multiplier bit is 1, the partial product is the same as the multiplicand).
• Store all the partial products, shift each partial product left by the position of its multiplier bit,
and then add all of them.

An easier way is the Shift/Add method.

In this we do running addition, i.e. we don't wait till all partial products are calculated, but
add each partial product to the result as soon as it is calculated,
i.e.
initially Result = 0, then
Result = Result + partial product 1
Result = Result + partial product 2 (shifted left by 1)
Result = Result + partial product 3 (shifted left by 2)
and so on.

• For each 1 in the multiplier, add the multiplicand to the result and then shift the multiplicand left.
• For each 0 in the multiplier, only the shift is required (because in that case the partial
product is 0000).
Algorithm for Shift/Add Method

1. DL is loaded with 8. DL is used as a counter to repeat the shift & add 8 times, as there are
8 bits in the multiplier.
2. RCX is loaded with 0. RCX is used to store partial products as well as the final result.
3. RAX holds multiplicand = 0000 1001
4. BL holds multiplier = 0000 1101
5. BL is shifted right by one bit with the SHR instruction, so the LSB goes into the carry flag.
6. If the carry flag is 1, then RAX is added to RCX & RAX is shifted left by one bit.
a. But if the carry flag is 0, then RAX is only shifted left by one bit.
7. Decrement DL
8. If DL is not zero, then go to 5 else go to 9
9. Display the result which is in RCX after converting it to ASCII
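
The same steps can be traced with a short Python model. It is a sketch and not the NASM procedure: the variable names mirror the registers, and the carry flag is modelled by reading the LSB of BL before the shift.

```python
def shift_add(multiplicand, multiplier):
    """Multiply two unsigned 8 bit numbers by the shift/add method."""
    dl = 8                       # step 1: one pass per multiplier bit
    rcx = 0                      # step 2: running total and final result
    rax = multiplicand & 0xFF    # step 3
    bl = multiplier & 0xFF       # step 4
    while dl != 0:
        carry = bl & 1           # step 5: SHR moves the LSB of BL into the carry flag
        bl >>= 1
        if carry:                # step 6: add only when the multiplier bit was 1
            rcx += rax
        rax <<= 1                # the multiplicand is shifted left in both cases
        dl -= 1                  # step 7
    return rcx


print(shift_add(0b00001001, 0b00001101))   # 117, i.e. 9 * 13
```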

Algorithm for procedure main:-


1. Display the following menu for user:

1. Successive Addition
2. Shift & add method
3. Exit
Enter your choice::
2. Accept the choice from user.
3. If (choice = 1), then call procedure SUCCADD & go to 1, else go to 4.
4. If (choice = 2), then call procedure SHIFTADD & go to 1, else go to 5.
5. If (choice = 3), then exit, else go to 1.

Print of Program
Print of output
Experiment 9

Title File statistics


Problem statement Write X86 ALP to find a) number of blank spaces, b) number of lines, c)
occurrence of a particular character. Accept the data from the text file. The
text file has to be accessed during Program_1 execution; write FAR
PROCEDURES in Program_2 for the rest of the processing. Use of PUBLIC
and EXTERN directives is mandatory.

Theory:
Assembler directives:
Assembler directives, also known as pseudo-opcodes, are directions given to the assembler to
take some action or change a setting. They are not instructions of the 80386, so the assembler
does not translate them into machine code. Instead, they tell the assembler itself how to
assemble the program.

Number notations
The assembler accepts numbers in decimal, binary, octal and hex format.
The assembler accepts characters in ASCII format.
Decimal notation is 12
Binary notation is 1010b
Octal notation is 17q or 0o17
Hex notation is 0xA2 or 0A2h
ASCII notation is 'abcd'

section .text : This directive tells the assembler that the following lines are 80386 instructions written in
assembly language format, and that the translated machine code is to be written to the code segment.
section .data : This directive tells the assembler that the following lines are program data. The assembler
stores them in the data segment.

Label : A label is a name for an address in memory corresponding to either an instruction or a data value, so
the programmer can refer to the address by a name.

Num1 db 11h ;Here Num1 is label to data 11h , Num1 holds address of data 11h
Back : mov al,dl ;here Back holds address of instruction mov al,dl

DB/DW/DD/DQ
DB (define byte) : used to declare a byte size variable, i.e. to set aside space in memory
in units of one byte & assign an initial value.
Eg
Counter db 5 ; a single byte is reserved in memory and the value stored there is the binary of
;decimal 5
Array1 db 11h,22h,33h,44h ; declares Array1 of 4 bytes and initializes them as shown
Message1 db "this is a message" ; reserves as many bytes as there are characters in the message &
;stores their ASCII values there
DB is used for values up to 255.
DW (define word) : used to declare a word size variable, i.e. to set aside space in memory
in units of one word. It is used for values up to 65535.
DD (define double word) is used for values up to 4294967295 (2^32 - 1).
DQ (define quad word) is used for values up to 2^64 - 1.

DB/DW/DD/DQ are written in section .data


TIMES
is useful for defining arrays in which all elements have the same value
eg
marks TIMES 5 DB 0 ; five elements are defined, all having value 0; marks is the offset of the first
;element

RESB/ RESW/ RESD/ RESQ


The reserve directives are used for reserving space in memory for uninitialized data (i.e. data
whose value is not decided while writing the program).
RESB (reserve byte(s))
RESW (reserve word(s))
RESD (reserve double word(s))
RESQ (reserve quad word(s))

Eg 1
Result resb 4 ; four bytes are reserved in memory. The offset of the first byte is assigned to Result
Eg 2 (EQU and $)
m1 db 'ABCD'
m1len equ $-m1 ; if the offset of m1 is 0, $ (the current offset) is 4 after these four bytes, so $-m1 gives
;the value 4, which is assigned to the label m1len

RESB/RESW/RESD/RESQ are written in section .bss

Extern (EXTRN) : is used to tell the assembler that the labels or names following this directive are
defined in some other module. NASM spells this directive extern (EXTRN is the MASM/TASM spelling).

Public : is used to make a label or name available to other modules (in those other modules it is
declared as extern). NASM has no public keyword; the same job is done by the global directive.
eg
in one module P1.asm we define the count variable & make it public as shown below

global count
count db 0

in another module P2.asm, we declare that count is defined in some other module by

extern count

We assemble P1.asm & P2.asm separately to create the P1.o & P2.o object files.
These object files can be linked together by the linker using the command

ld -o program P1.o P2.o

This will link P1.o & P2.o to create an executable file by the name 'program'.

Global : in NASM it takes the place of public. A name defined in one module is declared global there and
extern in every other module that uses it.

Macro
A macro is mainly used to achieve modular programming. A macro is used in the same program where
it is defined. It also reduces the work of typing the same block of instructions again
& again.
Eg.
We need the write syscall many times to print a result/message on the screen:
mov rax,1
mov rdi,1
mov rsi,msg1
mov rdx,len1
syscall

we can define a macro for this as below:


%macro print 2
mov rax,1
mov rdi,1
mov rsi,%1
mov rdx,%2
syscall
%endmacro

Here we have declared a macro by the name 'print', to which two parameters are passed; they
replace %1 & %2.

Eg
---
----
print msg1, len1
---
The assembler replaces the line <print msg1, len1> by the instructions defined in the macro,
with %1 replaced by msg1 & %2 replaced by len1. The macro name is written exactly as it was defined,
because names defined with %macro are case-sensitive.
Thus the line <print msg1, len1> is expanded as:
mov rax,1
mov rdi,1
mov rsi, msg1
mov rdx, len1
syscall

Procedure / function / subroutine /subprogram

A procedure is used to divide a large program into smaller modules. A procedure can be used (called) by
many programs or by a single program many times.
Eg
We need to calculate a factorial many times. So we write a separate procedure which
only calculates the factorial. The program which requires the factorial calls this procedure and
passes the number whose factorial is to be calculated.

Program 1

---
Mov ax, [number] ; ax is loaded with the number whose factorial is to be calculated
Call Factorial
----

Factorial: ;this procedure calculates the factorial of the number which is in ax
---
---
Mov bx, result ;& stores the factorial in bx
Ret ;then returns to the calling program. The calling program uses the
;factorial which is in bx

Difference between macro and procedure

Syntax
Macro:
%macro macro_name number_of_parameters
<macro body>
%endmacro
Procedure:
procedure_name:
<procedure body>
RET

Example
Macro: %macro print 2
Procedure: CALL Factorial

Program size
Macro: wherever the macro name is written, the assembler replaces it by the set of instructions
written in the macro. So program size increases.
Procedure: the assembler does not replace the call instruction; the procedure body is assembled only
once. So program size does not increase.

Parameters
Macro: parameters are passed as part of the macro statement.
Procedure: parameters are passed in registers and memory locations.

Typical use
Macro: a small set of instructions, mostly less than ten instructions.
Procedure: a large set of instructions, mostly more than ten instructions.

Algorithm

Create three files:

abc.txt ; the file in which spaces, lines and occurrences of a character are counted
P1.asm ; opens file abc.txt, stores its contents in memory and calls the procedures in P2.asm
P2.asm ; counts the spaces, lines and occurrences of a character in the contents of abc.txt,
which are now in memory, & displays them

Assumptions :
1. The maximum size of file abc.txt is 200 bytes.
2. The counts of spaces, lines & occurrences of the character are each a single digit (at most 9),
to simplify the program.

Algorithm for P1.asm


1. Open file abc.txt
2. Read the file contents & store it in buffer (in memory)
3. Call procedure spaces ,which is in P2.asm
4. Call procedure lines ,which is in P2.asm
5. Call procedure occur_char, which is in P2.asm

Algorithm for P2.asm

Procedure spaces
1. Set pointer to buffer
2. Read data( character ) pointed by pointer
3. Compare it with ASCII code of space ( i.e. 20h)
4. If match, then increment counter of spaces
5. Increment pointer
6. If characters remain in the buffer then go to 2 else go to 7
7. Display count of spaces after converting to ASCII
8. Return to the calling program P1

Procedure lines
1. Set pointer to buffer
2. Read data( character ) pointed by pointer
3. Compare it with ASCII code of line feed ( i.e. 0Ah)
4. If match, then increment counter of lines
5. Increment pointer
6. If characters remain in the buffer then go to 2 else go to 7
7. Display count of lines after converting to ASCII
8. Return to the calling program P1

Procedure occur_char
1. Set pointer to buffer
2. Read data( character ) pointed by pointer
3. Compare it with ASCII code of character whose occurrence is to be counted
4. If match, then increment counter of occurrence
5. Increment pointer
6. If characters remain in the buffer then go to 2 else go to 7
7. Display count of occurrence after converting to ASCII
8. Return to the calling program P1
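
The three procedures share one loop and differ only in the ASCII code that is compared. A short Python model of that loop is given below as a sketch. It is not the NASM code: the names count_byte and file_statistics are assumptions, and the buffer is a sample byte string rather than the contents of abc.txt.

```python
def count_byte(buffer, code):
    """Count how many bytes in buffer are equal to code (steps 1 to 6 of each procedure)."""
    count = 0
    pointer = 0                        # step 1: set pointer to the start of the buffer
    while pointer < len(buffer):       # step 6: repeat while characters remain
        if buffer[pointer] == code:    # steps 2 to 4: read, compare, count on a match
            count += 1
        pointer += 1                   # step 5
    return count


def file_statistics(buffer, char):
    """Return the counts of spaces (20h), line feeds (0Ah) and a chosen character."""
    return {
        "spaces": count_byte(buffer, 0x20),
        "lines": count_byte(buffer, 0x0A),
        "occurrences": count_byte(buffer, ord(char)),
    }


sample = b"one two\nthree four\n"
print(file_statistics(sample, "o"))   # {'spaces': 2, 'lines': 2, 'occurrences': 3}
```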

Print of program
Print of output
Experiment 10
Title Password check
Problem statement Write an X86/64 ALP password program that operates as follows:
a. Do not display what is actually typed; instead display an asterisk ("*").
b. If the password is correct display "access is granted", else display "Access
not Granted".

Theory
A password is used for authentication (i.e. only persons who have been assigned credentials are
allowed to use the system or application, after entering the correct password).
The password entered by the user should not be displayed (echoed) on the screen.
But when keys are read with the normal read syscall (function 0, rax=0), the terminal echoes the
typed keys,
i.e.
mov rax,0
mov rdi,0
mov rsi,starting memory address to store entered keys
mov rdx,number of keys to read
syscall

How to turn off echo

The termios structure (struct termios) is the general terminal interface used to control
asynchronous communications devices such as the keyboard (terminal).
The c_lflag field (local mode flags) of struct termios controls higher-level aspects of
input processing such as echoing and the choice of canonical or noncanonical input.

tcflag_t ECHO
If this bit is set, echoing of input characters back to the terminal is enabled.
If this bit is cleared, echoing is disabled.

tcflag_t ICANON
If this bit is set, canonical input processing mode is enabled. Otherwise, input is processed in
noncanonical mode.
In canonical input processing mode, no input can be read until the <enter> key is pressed.
In noncanonical input mode, the special editing characters such as ERASE (backspace key) and
KILL (usually Ctrl+U) are not processed, and the program has to check for the <enter> key (ASCII code 0AH) itself.
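
The effect of the two bits can be tried from Python, whose standard termios module (Unix only) exposes the same structure and flag names. This is a sketch and not the NASM program: the helper names are assumptions, and disable_echo_and_canon must be called on a real terminal.

```python
import termios


def without_echo_and_canon(lflag):
    """Return the local mode flags with the ECHO and ICANON bits cleared."""
    return lflag & ~(termios.ECHO | termios.ICANON)


def with_echo_and_canon(lflag):
    """Return the local mode flags with the ECHO and ICANON bits set."""
    return lflag | termios.ECHO | termios.ICANON


def disable_echo_and_canon(fd=0):
    """Turn off echo and canonical mode on terminal fd and return the old settings."""
    old = termios.tcgetattr(fd)      # [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    new = list(old)
    new[3] = without_echo_and_canon(old[3])       # index 3 is c_lflag
    termios.tcsetattr(fd, termios.TCSANOW, new)   # write the updated flags back
    return old
```

The settings returned by disable_echo_and_canon can later be restored with termios.tcsetattr(fd, termios.TCSANOW, old).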

Algorithm
Disable echo & canonical mode.
Accept password entered by user while displaying * on screen
Compare entered password with stored password
If they match then print "access granted" else print "access not granted"
Enable echo & canonical mode.
Algorithm ( detailed)

1. The correct password is pre-stored in a buffer (memory).
2. Read the current terminal settings into the termios structure (read_stdin_termios).
3. Clear the canonical bit & echo bit in the local mode flags of that structure (turn off echo &
canonical mode).
4. Write back the updated flags (write_stdin_termios).
5. Read a key entered by the user.
6. Check if it is the enter key; if yes then go to 8, else store the key in the entered-password buffer.
7. Print an asterisk, i.e. *, on the screen & go to 5.
8. Check if the number of characters in the entered password is equal to that of the stored password. If not
equal then go to 18.
9. Set pointer1 to the stored password.
10. Set pointer2 to the entered password.
11. Set a counter equal to the password length.
12. Compare the character pointed to by pointer1 with the character pointed to by pointer2.
13. If different then go to 18 else go to 14.
14. Increment pointer1 & pointer2.
15. Decrement the counter.
16. If the counter is not zero then go to 12 else go to 17.
17. Display the message "access granted" & go to 19.
18. Display the message "access not granted".
19. Read the terminal settings again (read_stdin_termios).
20. Set the canonical bit & echo bit in the local mode flags of that structure.
21. Write back the updated flags (write_stdin_termios).
22. Exit the program.
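
Steps 5 to 18 can be modelled in Python without touching the terminal. This is a sketch and not the NASM program: the keys are taken from a byte string instead of the keyboard, and the function names read_masked and check_password are assumptions.

```python
def read_masked(keys):
    """Collect keys up to <enter> (0Ah) and build the line of asterisks shown on screen."""
    entered = bytearray()
    shown = ""
    for key in keys:            # step 5: one key at a time
        if key == 0x0A:         # step 6: stop at the enter key
            break
        entered.append(key)
        shown += "*"            # step 7: an asterisk is displayed instead of the key
    return bytes(entered), shown


def check_password(stored, entered):
    """Compare the entered password with the stored one, character by character."""
    if len(entered) != len(stored):              # step 8
        return "access not granted"
    pointer = 0                                  # steps 9 and 10 (same offset in both buffers)
    counter = len(stored)                        # step 11
    while counter != 0:
        if stored[pointer] != entered[pointer]:  # steps 12 and 13
            return "access not granted"
        pointer += 1                             # step 14
        counter -= 1                             # step 15
    return "access granted"                      # step 17


entered, shown = read_masked(b"secret\n")
print(shown)                                # ******
print(check_password(b"secret", entered))   # access granted
```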

Print of program
Print of output
