PL C6

The document provides an overview of data types in programming, explaining their definitions, roles, and classifications such as primitive, numeric, boolean, and character types. It discusses how data types group similar data values and the operations that can be performed on them, along with the importance of descriptors for managing variable attributes. Additionally, it covers specific data types like integers, floating-point numbers, complex numbers, and decimal types, highlighting their characteristics and applications in programming languages.

PROGRAMMING LANGUAGES

DATA TYPES

1
Introduction

 A data type defines a collection of data values and a set of
predefined operations on those values.
 Computer programs produce results by manipulating data.
 ALGOL 68 provided a few basic types and a few flexible structure-
defining operators that allow a programmer to design a data
structure for each need.
 A descriptor is the collection of the attributes of a variable.
 In an implementation a descriptor is a collection of memory cells
that store variable attributes.

Data Types

A data type defines a collection of data values and a set of predefined operations
on those values.
→ A data type is a way to group similar kinds of data (like numbers or letters) and
decide what actions can be done with them (like adding numbers or combining text).
Computer programs produce results by manipulating data.
→ Programs work by taking data, changing or processing it in some way, and then
producing a result.
ALGOL 68 provided a few basic types and a few flexible structure-defining operators
that allow a programmer to design a data structure for each need.
→ The ALGOL 68 programming language gave programmers some basic data types
and special tools to create custom data structures for different purposes.
A descriptor is the collection of the attributes of a variable.
→ A descriptor is a set of details about a variable, like its type, size, and where it is
stored in memory.
In an implementation, a descriptor is a collection of memory cells that store
variable attributes.
→ When a program runs, a descriptor is stored in memory, holding information about
a variable so the computer knows how to use it.

2
Introduction

 If the attributes are static, descriptors are required only at
compile time.
 These descriptors are built by the compiler, usually as a part of
the symbol table, and are used during compilation.
 For dynamic attributes, part or all of the descriptor must be
maintained during execution.
 Descriptors are used for type checking and by allocation and
deallocation operations

Data Types

If the attributes are static, descriptors are required only at compile time.
→ If a variable's details (like type and size) do not change, the descriptor is only
needed while the program is being compiled, before it runs.
These descriptors are built by the compiler, usually as a part of the symbol table,
and are used during compilation.
→ The compiler creates these descriptors and stores them in a table (called the symbol
table) to help check and organize variables while the program is being compiled.
For dynamic attributes, part or all of the descriptor must be maintained during
execution.
→ If a variable's details can change while the program runs, the descriptor must be
kept in memory so the program can update and use it as needed.
Descriptors are used for type checking and by allocation and deallocation
operations.
→ Descriptors help make sure variables are used correctly (type checking) and
manage memory when variables are created or removed.
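
As a rough illustration of the points above, a descriptor can be pictured as a small record of attributes kept in a symbol table. This is only a minimal Python model, not how any particular compiler works. The names Descriptor, symbol_table, and check_assignment, and the sizes and addresses shown, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Attributes of one variable (hypothetical layout, for illustration only)."""
    name: str
    type_name: str
    size_bytes: int
    address: int


# A toy symbol table: maps each variable name to its descriptor.
symbol_table = {
    "count": Descriptor("count", "int", 4, 0x1000),
    "ratio": Descriptor("ratio", "double", 8, 0x1008),
}


def check_assignment(target: str, source: str) -> bool:
    """Simplified type check: allow assignment only between identical types."""
    return symbol_table[target].type_name == symbol_table[source].type_name


print(check_assignment("count", "count"))
print(check_assignment("count", "ratio"))
```

Here the type check only reads the descriptors. If every attribute is static, a table like this is needed only during compilation.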

3
Primitive Data Types

 Those not defined in terms of other data types are called primitive data
types.
 Almost all programming languages provide a set of primitive data types.
 Some primitive data types are merely reflections of the hardware – for
example, most integer types.
 The primitive data types of a language are used, along with one or more
type constructors.

Data Types

Those not defined in terms of other data types are called primitive data types.
→ Primitive data types are basic types that are not built from other types, like
numbers and letters.
Almost all programming languages provide a set of primitive data types.
→ Most programming languages include basic data types like numbers, text, and
true/false values.
Some primitive data types are merely reflections of the hardware – for example,
most integer types.
→ Some basic data types, like whole numbers (integers), exist because computer
hardware is designed to handle them directly.
The primitive data types of a language are used, along with one or more type
constructors.
→ Basic data types can be combined or modified using special tools (type
constructors) to create more complex data structures.

4
Numeric Types

 Integer

 Floating-point

 Complex

 Decimal

Data Types

5
Integer

 The most common primitive numeric data type is integer.


 The hardware of many computers supports several sizes of integers.
 These sizes of integers, and often a few others, are supported by some
programming languages.
 Java includes four signed integer sizes: byte, short, int, and long.
 C++ and C# include unsigned integer types, which are types for integer
values without signs.

Data Types

The most common primitive numeric data type is integer.


→ The most commonly used basic number type is the integer, which represents
whole numbers.
The hardware of many computers supports several sizes of integers.
→ Computers can handle different sizes of whole numbers, depending on how much
memory they use.
These sizes of integers, and often a few others, are supported by some
programming languages.
→ Many programming languages allow different integer sizes to match what the
computer can handle.
Java includes four signed integer sizes: byte, short, int, and long.
→ Java provides four types of whole numbers, each with a different range: byte
(smallest), short, int, and long (largest).
C++ and C# include unsigned integer types, which are types for integer values
without signs.
→ C++ and C# also allow whole numbers that cannot be negative (unsigned integers),
meaning they store only zero and positive values.
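
The range of each integer size follows directly from its number of bits. The sketch below computes the ranges for the four Java sizes (8, 16, 32, and 64 bits, stored in two's complement) and for an 8-bit unsigned type. The helper names signed_range and unsigned_range are made up for this example.

```python
def signed_range(bits: int) -> tuple:
    """Range of a two's-complement signed integer with the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits: int) -> tuple:
    """Range of an unsigned integer with the given width."""
    return 0, 2 ** bits - 1


java_sizes = {"byte": 8, "short": 16, "int": 32, "long": 64}
for name, bits in java_sizes.items():
    low, high = signed_range(bits)
    print(f"{name:5} ({bits:2} bits): {low} .. {high}")

print("unsigned 8-bit:", unsigned_range(8))
```

An unsigned type uses the same number of bits as the signed type of that size, but spends none of them on negative values, so its largest value is about twice as big.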

6
Floating-point

 Model real numbers, but only as approximations for most real values.
 On most computers, floating-point numbers are stored in binary, which
exacerbates the problem.
 Another problem is the loss of accuracy through arithmetic operations.
 Languages for scientific use support at least two floating-point types;
sometimes more (e.g. float, and double.)
 The collection of values that can be represented by a floating-point type is
defined in terms of precision and range.
 Precision: is the accuracy of the fractional part of a value, measured
as the number of bits. Figure below shows single and double
precision.
 Range: is the range of fractions and exponents.

Data Types

Floating point
→ A floating-point number is a number that can have a fractional part, like 3.14 or -2.5.
Model real numbers, but only as approximations for most real values.
→ Floating-point numbers represent real numbers, but they are not always exact due to how
computers store them.
On most computers, floating-point numbers are stored in binary, which exacerbates the
problem.
→ Computers store floating-point numbers in binary (1s and 0s), so many decimal fractions
(such as 0.1) cannot be stored exactly, which causes small errors in calculations.
Another problem is the loss of accuracy through arithmetic operations.
→ When doing math with floating-point numbers, small errors can build up, making the
result slightly inaccurate.
Languages for scientific use support at least two floating-point types; sometimes more
(e.g., float and double).
→ Programming languages used for science and engineering usually offer at least two types
of floating-point numbers, like float (less precise) and double (more precise).
The collection of values that can be represented by a floating-point type is defined in terms
of precision and range.
→ Floating-point numbers have limits on how accurately they store values
(precision) and how big or small the numbers can be (range).
Precision: is the accuracy of the fractional part of a value, measured as the number of bits.
Figure below shows single and double precision.
→ Precision refers to how accurately the fractional part of a floating-point number is stored, which
depends on how many bits are used (e.g., single precision uses fewer bits than double precision).
Range: is the range of fractions and exponents.
→ Range refers to how large or small the floating-point number can be, which is determined by
both the fraction and the exponent (as in scientific notation).
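
Both problems, inexact storage and errors that build up through arithmetic, are easy to observe. The short Python sketch below uses Python's float type, which is double precision on common platforms. The variable name total is just for this example.

```python
import math

# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
print(0.1 + 0.2 == 0.3)

# Adding 0.1 ten times accumulates a small error.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)

# Comparing with a tolerance is the usual way to cope with this.
print(math.isclose(total, 1.0))
```

The first two comparisons come out false and the last one true, which is why floating-point values are normally compared within a tolerance rather than for exact equality.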

7
Floating-point

Data Types

This image illustrates the IEEE 754 standard for floating-point representation, which
is used in computer systems to represent real numbers. It shows two formats:

(a) Single Precision (32 bits total)
• 1 bit – Sign bit
  • Indicates the sign of the number:
    • 0 = positive
    • 1 = negative
• 8 bits – Exponent
  • Stores the exponent in biased form (bias = 127).
  • The actual exponent = stored exponent − 127.
• 23 bits – Fraction (also called Mantissa or Significand)
  • Stores the significant digits of the number.
  • A leading 1 is implied (not stored), making it a normalized number.
Total: 1 (sign) + 8 (exponent) + 23 (fraction) = 32 bits

(b) Double Precision (64 bits total)
• 1 bit – Sign bit
  • Same function as in single precision.
• 11 bits – Exponent
  • Stores the exponent in biased form (bias = 1023).
  • The actual exponent = stored exponent − 1023.
• 52 bits – Fraction (Mantissa)
  • Stores the significant digits with a leading 1 implied.
Total: 1 (sign) + 11 (exponent) + 52 (fraction) = 64 bits
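
To make the single-precision layout concrete, the sketch below splits a number into its three fields using Python's standard struct module and then rebuilds the value from them. The function name decode_single is made up for this example, and the rebuild formula applies only to normalized numbers.

```python
import struct


def decode_single(value: float):
    """Return the sign, stored exponent, and fraction fields of the
    IEEE 754 single-precision encoding of value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                      # 1 bit
    stored_exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF             # 23 bits
    return sign, stored_exponent, fraction


sign, stored_exponent, fraction = decode_single(-6.25)

# Rebuild the value of a normalized number: the leading 1 is implied.
rebuilt = (-1) ** sign * (1 + fraction / 2 ** 23) * 2 ** (stored_exponent - 127)
print(sign, stored_exponent, fraction, rebuilt)
```

For -6.25, which is -1.1001 × 2² in binary, the sign bit is 1, the stored exponent is 2 + 127 = 129, and the fraction field holds the bits 1001 followed by zeros.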

8
Complex

 Some languages support a complex type: e.g., Fortran and Python


 Each value consists of two floats: the real part and the imaginary part
 Literal form (in Python):

(7 + 3j)

where 7 is the real part and 3 is the imaginary part

Data Types

Some languages support a complex type: e.g., Fortran and Python.


→ Some programming languages, like Fortran and Python, allow a special number type called
"complex numbers."
Each value consists of two floats: the real part and the imaginary part.
→ A complex number has two parts: a real number (like 7.0) and an imaginary number (like
3.0j).
Literal form (in Python): (7 + 3j).
→ In Python, you write a complex number using this format: (real + imaginary j), like (7 + 3j).
Where 7 is the real part and 3 is the imaginary part.
→ In (7 + 3j), the number 7 is the real part, and 3j is the imaginary part (where j
represents √-1).
A complex number is a special type of number that has two parts:
1.Real part – A regular number, like 7.
2.Imaginary part – A number that includes j (or i in mathematics), which represents
the square root of -1.
Here, 7 is the real part.
3j is the imaginary part (Python uses j instead of i like in math).

They are useful in engineering, physics, and computer science, especially for signal processing
and electrical circuits.

Python and some other programming languages (like Fortran) have built-in support for
complex numbers, allowing mathematical operations like addition, subtraction,
multiplication, and division.
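
A few lines of Python show the built-in complex type in use. The variable names z and w are chosen only for this example.

```python
z = 7 + 3j
w = 1 - 2j

print(z.real, z.imag)   # both parts are stored as floats
print(z + w)            # addition works part by part
print(z * w)            # multiplication uses j * j = -1
print(abs(3 + 4j))      # magnitude of a complex number
```

The product works out as (7)(1) + (7)(-2j) + (3j)(1) + (3j)(-2j) = 7 - 14j + 3j + 6 = 13 - 11j.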

9
Decimal

 Most larger computers that are designed to support business applications
have hardware support for decimal data types
 Decimal types store a fixed number of decimal digits, with the decimal
point at a fixed position in the value
 These are the primary data types for business data processing and are
therefore essential to COBOL

 Advantage: accuracy of decimal values

 Disadvantages: limited range since no exponents are allowed, and its
representation wastes memory

Data Types

Most larger computers that are designed to support business applications have
hardware support for decimal data types.
→ Big computers used in business applications often have special hardware that
helps them handle decimal numbers more easily and accurately.
Decimal types store a fixed number of decimal digits, with the decimal point at a
fixed position in the value.
→ Decimal numbers are stored with a specific number of digits, and the decimal
point (like 3.14) is always in the same place.
These are the primary data types for business data processing and are therefore
essential to COBOL.
→ Decimal numbers are the main types of data used in business calculations, so they
are very important for COBOL, a programming language used for business
applications.
Advantage: accuracy of decimal values.
→ One benefit of using decimal types is that they provide exact accuracy for decimal
numbers, avoiding rounding errors.
Disadvantages: limited range since no exponents are allowed, and its
representation wastes memory.
→ The downside of decimal types is that they can only represent a limited range of
numbers because they don’t use scientific notation (exponents), and they can use up
more memory than necessary.
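
The accuracy advantage can be seen with Python's standard decimal module. This is only an illustration of exact decimal arithmetic: the module is a software implementation and is not the fixed-position hardware format described above. The names price and total are made up for this example.

```python
from decimal import Decimal

price = Decimal("19.99")
total = price * 3
print(total)

# Decimal values such as 0.10 are stored exactly, so the comparison holds.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

# The same comparison with binary floating-point does not hold.
print(0.10 + 0.20 == 0.30)
```

This is why amounts of money are normally kept in a decimal type instead of a binary floating-point type.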

10
Boolean Types

 Introduced by ALGOL 60
 They are used to represent switches and flags in programs
 The use of Booleans enhances readability
 Range of values: two elements, one for “true” and one for “false”
 One popular exception is C89, in which numeric expressions are used as
conditionals. In such expressions, all operands with nonzero values are
considered true, and zero is considered false
 A Boolean value could be represented by a single bit, but is often stored in
the smallest efficiently addressable cell of memory, typically a byte

Data Types

1.Introduced by ALGOL 60
→ Boolean types were first introduced in the programming language ALGOL 60.
2.They are used to represent switches and flags in programs
→ Boolean values are commonly used to indicate whether something is on/off or to
represent conditions like "yes/no" in programs.
3.The use of Booleans enhances readability
→ Using Booleans makes code easier to understand because they clearly express true
or false conditions.
4.Range of values: two elements, one for “true” and one for “false”
→ A Boolean type can only have two values: true (usually represented as 1) or false
(usually represented as 0).
5.One popular exception is C89, in which numeric expressions are used as
conditionals. In such expressions, all operands with nonzero values are considered
true, and zero is considered false
→ In C89, instead of using true or false directly, numbers are used for conditions. Any
number other than zero is treated as true, and zero is treated as false.
6.A Boolean value could be represented by a single bit, but often stored in the
smallest efficiently addressable cell of memory, typically a byte
→ While a Boolean could technically be stored as just one bit (0 or 1), it is often
stored in a byte (8 bits) because that’s the smallest unit of memory that a computer
can easily work with.
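
The sketch below contrasts a named Boolean flag with the C89 convention of using numbers as conditions. It is written in Python, and the helper c89_condition is a made-up function that only mimics the C89 rule.

```python
def c89_condition(value) -> bool:
    """Mimic a C89 conditional: any nonzero number counts as true."""
    return value != 0


door_is_open = True            # a flag with a readable name
if door_is_open:
    print("close the door")

for number in (5, -1, 0):
    print(number, "is treated as", c89_condition(number))

# Python's own Boolean values correspond to the integers 1 and 0.
print(int(True), int(False))
```

A flag such as door_is_open states its purpose directly, which is the readability benefit mentioned above.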

11
Character Types

 Char types are stored as numeric codings (ASCII / Unicode)


 Traditionally, the most commonly used coding was the 8-bit code ASCII
(American Standard Code for Information Interchange)
 An alternative, 16-bit coding: Unicode (UCS-2)
 Java was the first widely used language to use the Unicode character set.
Since then, it has found its way into JavaScript, Python, Perl, C# and F#
 After 1991, the Unicode Consortium, in cooperation with the International
Standards Organization (ISO), developed a 4-byte character code named
UCS-4 or UTF-32, which is described in the ISO/IEC 10646 Standard,
published in 2000

Data Types

Char types are stored as numeric codings (ASCII / Unicode)


→ Character types (like letters and symbols) are stored in computers as numbers,
using coding systems like ASCII or Unicode.
Traditionally, the most commonly used coding was the 8-bit code ASCII (American
Standard Code for Information Interchange)
→ The most common system for encoding characters was ASCII, which is usually stored in
one 8-bit byte per character. ASCII itself defines 128 characters (codes 0 to 127).
An alternative, 16-bit coding: Unicode (UCS-2)
→ An alternative to ASCII is the 16-bit Unicode coding UCS-2, which
allows for a much wider range of symbols and letters from different languages.
Java was the first widely used language to use the Unicode character set. Since
then, it has found its way into JavaScript, Python, Perl, C# and F#
→ Java was the first major programming language to fully adopt Unicode, and it has
since become widely used in other languages like JavaScript, Python, Perl, C#, and F#.
After 1991, the Unicode Consortium, in cooperation with the International
Standards Organization (ISO), developed a 4-byte character code named UCS-4 or
UTF-32, which is described in the ISO/IEC 10646 Standard, published in 2000
→ After 1991, the Unicode Consortium, together with ISO, developed a 4-byte character
code called UCS-4 (also known as UTF-32) to support even more characters. It is
described in the ISO/IEC 10646 Standard, published in 2000.
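
The idea that characters are stored as numeric codes, and that different codings use different numbers of bytes, can be checked in a few lines of Python. The sample characters are arbitrary choices for this example. Python's utf-16 codec is used here as a stand-in for a 16-bit coding: it needs two 16-bit units for characters that a pure 16-bit code such as UCS-2 cannot represent.

```python
print(ord("A"))        # numeric code of the letter A
print(chr(0x20AC))     # character whose code is hexadecimal 20AC (the euro sign)

samples = ["A", "\u20ac", "\U0001F600"]
for ch in samples:
    code = ord(ch)
    bytes_16 = len(ch.encode("utf-16-be"))
    bytes_32 = len(ch.encode("utf-32-be"))
    print(hex(code), bytes_16, bytes_32)
```

The last sample has a code above hexadecimal FFFF, so it does not fit in 16 bits, while the 4-byte coding stores every character in the same amount of space.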

12
Character String Types

 A character string type is one in which values are sequences of characters

Data Types

A character string type is one in which values are sequences of characters


→ A character string type is a data type where the value consists of a series or
sequence of characters, like words or sentences (e.g., "Hello" or "Goodbye").

13
Design Issues

 The two most important design issues:


– Is it a primitive type or just a special kind of array?
– Is the length of objects static or dynamic?

Data Types

The two most important design issues:


→ When designing character string types, there are two main questions to consider:
– Is it a primitive type or just a special kind of array?
→ The first question is whether strings are a basic, built-in type (like integer or
boolean) or simply an array of characters.
– Is the length of objects static or dynamic?
→ The second question is whether the length of a string is fixed
(static) or can change as needed (dynamic).

14
String and Their Operations

 Typical operations:
– Assignment
– Comparison (=, >, etc.)
– Catenation
– Substring reference
– Pattern matching
 C and C++ use char arrays to store char strings and provide a collection of
string operations through a standard library whose header is string.h
 Character strings are terminated with a special character, null, which is
represented with zero

Data Types

Typical operations:
→ Common actions that can be performed on strings include:
– Assignment
→ Assigning a value (like a word or sentence) to a string variable.
– Comparison (=, >, etc.)
→ Comparing strings to check if they are equal, greater than, or less than each other.
– Catenation
→ Combining (or concatenating) two strings to form a longer string (e.g., "Hello" +
"World" = "HelloWorld").
– Substring reference
→ Accessing a part (substring) of a string, like getting the first few characters of
"Hello" (e.g., "Hel").
– Pattern matching
→ Checking if a string matches a specific pattern, like searching for the word "cat" in
a sentence.
C and C++ use char arrays to store char strings and provide a collection of string
operations through a standard library whose header is string.h
→ In C and C++, strings are stored as arrays of characters, and there is a library (called
string.h) that includes many functions to manipulate strings.
Character strings are terminated with a special character, null, which is represented
with zero
→ In C and C++, strings are terminated by a special character called "null" ('\0'), which
signals the end of the string.
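
The typical operations listed above can be tried out directly. The sketch below uses Python strings, not C char arrays, and the variable names are made up for this example.

```python
import re

greeting = "Hello"                       # assignment
is_equal = greeting == "Hello"           # comparison for equality
comes_first = "apple" < "banana"         # comparison by character codes
joined = greeting + "World"              # catenation
prefix = greeting[:3]                    # substring reference
has_cat = re.search(r"cat", "the cat sat") is not None   # pattern matching

print(is_equal, comes_first, joined, prefix, has_cat)
```

In C the same operations would go through library functions, because there a string is only an array of characters.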

15
String and Their Operations

 How is the length of the char string decided?


– The null char which is represented with 0
– Ex:
char str[] = "apples";

In this example, str is an array of char elements, specifically apples0, where 0
is the null character

Data Types

How is the length of the char string decided?


→ The length of a character string is determined by finding the position of the null
character ('\0'), which marks the end of the string.
– The null char which is represented with 0
→ The null character ('\0') is used to signal the end of the string. It is represented by
the value 0.
– Ex: char str[] = "apples";
→ In this example, the string "apples" is stored as an array of characters: ['a', 'p', 'p',
'l', 'e', 's', '\0'].
In this example, str is an array of char elements, specifically apples0, where 0 is the
null character
→ Here, str is an array of characters that contains the word "apples" followed by the
null character ('\0') at the end, which indicates that the string ends at that point.
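
How the length is found from the null character can be modelled in a few lines. This Python sketch treats a bytes object as the memory cells of the C array. The function c_strlen is a made-up stand-in for what a length function has to do, not the C library code.

```python
def c_strlen(buffer: bytes) -> int:
    """Count cells up to, but not including, the first null (0) cell."""
    length = 0
    while buffer[length] != 0:
        length += 1
    return length


str_cells = b"apples\0"        # the cells occupied by: char str[] = "apples";
print(len(str_cells))          # cells in the array, including the null
print(c_strlen(str_cells))     # characters in the string
```

The array needs seven cells for the six characters of "apples", because the null character takes one cell of its own.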

16
String and Their Operations

 Some of the most commonly used library functions for character strings in
C and C++ are
– strcpy: copy strings
– strcat: catenates one given string onto another
– strcmp: lexicographically compares (by the order of their codes) two
strings
– strlen: returns the number of characters, not counting the null

 In Java, strings are supported by the String class, whose values are constant
strings, and the StringBuffer class, whose values are changeable and are more
like arrays of single characters
 C# and Ruby include string classes that are similar to those of Java
 Python strings are immutable, similar to the String class objects of Java

Data Types

Some of the most commonly used library functions for character strings in C and C++ are
→ Here are some of the most frequently used functions in C and C++ for working with
character strings:
– strcpy: copy strings
→ The strcpy function copies the contents of one string into another.
– strcat: catenates one given string onto another
→ The strcat function concatenates (or joins) one string to the end of another string.
– strcmp: lexicographically compares (the order of their codes) two strings
→ The strcmp function compares two strings character by character, using the numeric codes
of the characters (for example, their ASCII values) to decide which string comes first.
– strlen: returns the number of characters, not counting the null
→ The strlen function returns the length of a string by counting the number of characters,
but it doesn’t count the null character ('\0').
In Java, strings are supported by the String class, whose values are constant strings, and
the StringBuffer class whose values are changeable and are more like arrays of single
characters
→ In Java, strings are handled by the String class, where strings cannot be changed once
created, and by the StringBuffer class, which allows modification of the string, like an array of
characters.
C# and Ruby include string classes that are similar to those of Java
→ Both C# and Ruby have string classes similar to Java's, where strings can be handled in a
similar way, with immutable strings or mutable ones.
Python strings are immutable, similar to the String class objects of Java
→ In Python, strings cannot be changed after they are created, just like Java's String class
objects, which are also immutable.
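
Immutability is easy to see in Python. In the sketch below, an attempt to change a character in place fails, so a new string is built instead. The list of characters at the end is only a rough analogue of a changeable string such as Java's StringBuffer, not an equivalent of it.

```python
text = "Hello"
try:
    text[0] = "J"              # strings do not support item assignment
except TypeError as error:
    print("cannot modify in place:", error)

changed = "J" + text[1:]       # build a new string instead
print(changed, text)

buffer = list(text)            # a mutable sequence of single characters
buffer[0] = "J"
print("".join(buffer))
```

The original string is left untouched throughout. Only the new string and the list hold the changed text.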

17
String Length Options

 Static Length String: The length can be static and set when the string is
created. This is the choice for the immutable objects of Java’s String class
as well as similar classes in the C++ standard class library and the .NET class
library available to C# and F#

 Limited Dynamic Length Strings: allow strings to have varying length up to
a declared and fixed maximum set by the variable’s definition, as
exemplified by the strings in C

 Dynamic Length Strings: allow strings to have varying length with no maximum.
This option requires the overhead of dynamic storage allocation and
deallocation but provides flexibility. Ex: Perl and JavaScript

Data Types

String length options refer to how the size of a string is determined and whether it
can change during the program's execution. The way a string's length is handled
affects how memory is allocated and how flexible the string is in different
programming languages. Here are three common approaches:

1.Static Length String:


→ The length is fixed when the string is created. This option is used for immutable
objects like Java’s String class and similar classes in the C++ standard library, as well as
in the .NET class library for C# and F#.
2.Limited Dynamic Length Strings:
→ These strings can have varying lengths but are constrained by a maximum size that
is set when the string is declared. This approach is seen in strings in languages like C,
where the maximum length is defined but the string can vary up to that limit.
3.Dynamic Length Strings:
→ These strings can change in size with no predefined maximum length. Although
this offers more flexibility, it requires extra memory management for dynamic storage
allocation and deallocation. Examples of languages using dynamic length strings
include Perl and JavaScript.

18
Evaluation

 Aid to writability

 As a primitive type with static length, they are inexpensive to provide--why
not have them?

 Dynamic length is nice, but is it worth the expense?

Data Types

Evaluation:

1.Aid to writability
→ String types help make programming easier (writability) by providing a way to
handle sequences of characters, which are essential in many programs.
2.As a primitive type with static length, they are inexpensive to provide--why not
have them?
→ If strings have a fixed length (static), they are simple and cheap to implement, so
there’s little reason not to use them in programs.
3.Dynamic length is nice, but is it worth the expense?
→ While strings with dynamic length (which can change in size) offer more flexibility,
they come at the cost of additional memory management and processing, so it's
important to weigh whether the benefits justify the extra expense.

19
Implementation of Character String Types

 Static Length String - compile-time descriptor has three fields:


1. Name of the type
2. Type’s length
3. Address of first char

Data Types

The compile-time descriptor for static strings shown in Figure 6.2 consists of three
fields:
1.Static string (Type Name) – Represents the name of the string type, indicating that
it is a static string.
2.Length – Stores the number of characters in the string (i.e., the size of the string).
3.Address – Contains the memory location of the first character of the string.

Static strings are allocated at compile-time and have a fixed length.


The descriptor helps the compiler manage string operations, such as determining
length and accessing characters.
Unlike dynamically allocated strings, these cannot change in size at runtime.

20
Implementation of Character String Types

 Limited Dynamic Length Strings - may need a run-time descriptor for length
to store both the fixed maximum length and the current length (but not in
C and C++ because the end of a string is marked with the null character)

Data Types

These strings can change in length at runtime but have a fixed maximum length.
A run-time descriptor is needed to store:
1.Maximum length – The upper limit of how long the string can be.
2.Current length – The actual length of the string at any given time.

Why Not in C and C++?


•C and C++ do not need a run-time descriptor for this because they use the null
character (\0) to mark the end of a string.
•Instead of storing the length separately, functions like strlen() count characters until
they reach \0.
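
A run-time descriptor that holds both the maximum and the current length can be sketched as a small class. This is a minimal Python model under stated assumptions, not how any particular language implements such strings. The class LimitedString and its attribute names are hypothetical.

```python
class LimitedString:
    """Limited dynamic length string with a run-time descriptor
    (hypothetical class, for illustration only)."""

    def __init__(self, max_length: int, initial: str = ""):
        if len(initial) > max_length:
            raise ValueError("initial value exceeds maximum length")
        self.max_length = max_length           # fixed when the variable is declared
        self.current_length = len(initial)     # updated whenever the value changes
        self._cells = list(initial) + [""] * (max_length - len(initial))

    def assign(self, value: str) -> None:
        if len(value) > self.max_length:
            raise ValueError("value exceeds maximum length")
        self._cells[:len(value)] = value       # overwrite the leading cells
        self.current_length = len(value)

    def value(self) -> str:
        return "".join(self._cells[:self.current_length])


fruit = LimitedString(10, "apples")
fruit.assign("fig")
print(fruit.value(), fruit.current_length, fruit.max_length)
```

Because the current length is stored, finding it takes no scan through the characters. The C approach saves the descriptor but must count up to the null character each time.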

21
Implementation of Character String Types

 Dynamic Length Strings


– Need only a simple run-time descriptor, because only the current length needs to be stored
– Allocation/deallocation is the biggest implementation problem. Storage to
which it is bound must grow and shrink dynamically
– There are three approaches to supporting allocation and deallocation:

1. Strings can be stored in a linked list, so that when a string grows, the newly
required cells can come from anywhere in the heap
• The drawbacks to this method are the extra storage occupied by the
links in the list representation and the necessary complexity of string
operations
• String operations are slowed by the required pointer chasing

Data Types

Implementation of Character String Types


Dynamic Length Strings
Dynamic length strings can change in size during program execution, requiring specific
management for memory allocation and deallocation. Here's an explanation of their
implementation:
Need only a simple run-time descriptor, because only the current length needs to be stored
→ Since the length of dynamic strings can change, a descriptor (a piece of metadata) is
needed during execution. It can be simple, because it only has to store the current length of
the string, with no fixed maximum.
Allocation/deallocation is the biggest implementation problem. Storage to which it is
bound must grow and shrink dynamically
→ The most challenging part of implementing dynamic strings is managing memory. The
memory used by the string must be able to expand when more characters are added and
shrink when characters are removed.

There are three approaches to supporting allocation and deallocation:


1. Strings can be stored in a linked list, so that when a string grows, the newly
required cells can come from anywhere in the heap
→ In this method, the string is stored in a linked list, allowing parts of the string
to grow or shrink dynamically as needed, using available memory (heap).
Drawbacks of this method: → The linked list requires extra memory for the
links between elements, and the complexity of managing these links adds to the
difficulty of string operations. Additionally, string operations can become slower
because the program must follow (or "chase") the pointers in the list to access the
string's contents.

22
Implementation of Character String Types

2. Store strings as arrays of pointers to individual characters allocated in the
heap
• This method still uses extra memory, but string processing can be
faster than with the linked-list approach

3. Store strings in adjacent storage cells


• “What about when a string grows?” Find a new area of memory and
the old part is moved to this area.
• Allocation and deallocation is slower but using adjacent cells results in
faster string operations and requires less storage. This approach is the
one typically used

Data Types

2. Store strings as arrays of pointers to individual characters allocated in the heap


→ In this method, the string is stored as an array where each element is a pointer to a
character, and the actual characters are allocated in the heap. While this method still
requires extra memory for the pointers, string processing can be faster than the
linked-list approach because any character can be reached by indexing the array and
following one pointer, instead of walking through the list link by link.

3. Store strings in adjacent storage cells


→ This approach stores all characters of a string in consecutive memory cells,
meaning the characters are stored next to each other in memory. When the string
grows in size, a new memory area is allocated, and the existing string is moved to this
new location.
•“What about when a string grows?” Find a new area of memory and the old part is
moved to this area.
→ When the string needs more space, a new block of memory is allocated to
accommodate the larger string, and the data from the old memory block is copied to
the new one.
•Allocation and deallocation are slower, but using adjacent cells results in faster string
operations and requires less storage. This approach is the one typically used
→ While allocating and deallocating memory in this method can be slower due to the
need to find a new memory block when the string grows, the advantage is that it
allows faster string operations (such as accessing and modifying characters) and
requires less memory overall, making it a commonly used approach in many
programming languages.
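The grow-and-move behavior of approach 3 can be sketched in Python. This is a toy model only: the class name ContiguousString, the starting capacity of 8 cells, and the doubling policy are assumptions made for illustration, not details from the slides.

```python
class ContiguousString:
    """Toy model of a string kept in one block of adjacent storage cells."""

    def __init__(self, text: str, capacity: int = 8):
        data = text.encode("ascii")
        self.capacity = max(capacity, len(data))
        self.cells = bytearray(self.capacity)  # the adjacent storage cells
        self.cells[:len(data)] = data
        self.length = len(data)
        self.relocations = 0                   # how many times the string was moved

    def append(self, text: str) -> None:
        data = text.encode("ascii")
        needed = self.length + len(data)
        if needed > self.capacity:
            # Not enough room: find a new, larger area and move the old part there.
            # Doubling the capacity is an assumed policy, not taken from the slides.
            new_capacity = max(needed, 2 * self.capacity)
            new_cells = bytearray(new_capacity)
            new_cells[:self.length] = self.cells[:self.length]
            self.cells = new_cells
            self.capacity = new_capacity
            self.relocations += 1
        self.cells[self.length:needed] = data
        self.length = needed

    def char_at(self, index: int) -> str:
        # With adjacent cells, access is a single offset into the block.
        if not 0 <= index < self.length:
            raise IndexError("string index out of range")
        return chr(self.cells[index])

    def __str__(self) -> str:
        return self.cells[:self.length].decode("ascii")


s = ContiguousString("data", capacity=8)
s.append(" typ")  # still fits in the original 8 cells
s.append("es")    # needs 10 cells, so the string is moved to a new block
print(s, s.capacity, s.relocations)
```

Only the second append triggers a move, which is where the slower allocation cost shows up; every character access stays a direct offset.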

23
Enumeration Types

 All possible values, which are named constants, are provided, or


enumerated, in the definition
 Enumeration types provide a way of defining and grouping collections of
named constants, which are called enumeration constants
 C# example

enum days {mon, tue, wed, thu, fri, sat, sun};

The enumeration constants are typically implicitly assigned the integer values
0, 1, …, but can be explicitly assigned any integer literal in the type’s definition


Enumeration Types
Enumeration types are a way to define a set of named constants that represent
specific values. Here’s an explanation of the concept:
•All possible values, which are named constants, are provided, or enumerated, in
the definition
→ In an enumeration type, you define a list of possible values that a variable of this
type can have. Each of these values is given a name (constant), making the code more
readable.
•Enumeration types provide a way of defining and grouping collections of named
constants, which are called enumeration constants
→ Enumerations allow you to group related named constants together under a single
type. The constants themselves are called enumeration constants. This helps organize and simplify code.

enum days { mon, tue, wed, thu, fri, sat, sun };


→ This is an example in C# where the enum keyword is used to define an
enumeration type called days, with the values representing the days of the week.
The enumeration constants are typically implicitly assigned the integer values, 0, 1,
…, but can be explicitly assigned any integer literal in the type’s definition
→ By default, each enumeration constant is assigned an integer starting from 0 (e.g.,
mon = 0, tue = 1, etc.). However, you can manually assign specific integer values to
these constants if needed, giving you more control over their values.
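The implicit and explicit numbering described above can be tried out with a short Python sketch. It uses Python's standard enum module (available since Python 3.4) rather than C#; the Priority type and its values are hypothetical and chosen only to show explicit assignment.

```python
from enum import IntEnum

# Implicit values: start=0 mirrors the 0, 1, ... numbering described above.
Days = IntEnum("Days", ["mon", "tue", "wed", "thu", "fri", "sat", "sun"], start=0)


# Explicit values: each constant is given an integer literal in the definition.
class Priority(IntEnum):
    low = 10
    medium = 20
    high = 30


print(Days.mon.value, Days.tue.value, Days.sun.value)
print(Priority.high.value)
print([d.name for d in Days])
```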

24
Designs

 In languages that do not have enumeration types, programmers usually


simulate them with integer values
 For example, early versions of C did not have an enumeration type. We might use 0 to
represent red, 1 to represent blue, and so forth. These values could be
defined as follows:

int red = 0, blue = 1;


Designs
In some programming languages, where enumeration types are not available,
programmers use integer values to simulate them. Here's how it's done:
•In languages that do not have enumeration types, programmers usually simulate
them with integer values
→ If a programming language doesn’t support enumeration types, developers often
use regular integer variables to represent the different values of an enumeration,
giving each value a unique integer.
•For example, early versions of C did not have an enumeration type. We might use 0 to
represent red, 1 to represent blue, and so forth. These values could be defined as follows:
→ Before C gained a built-in enumeration type, developers could
use integer values to represent different colors. Here, 0 represents red and 1
represents blue.

int red = 0, blue = 1;


→ This code defines two integer variables, red and blue, where 0 is used for red and
1 for blue. This simulates an enumeration using integers.

25
Designs

 In C++, we could have the following:

enum colors {red, blue, green, yellow, black};


colors myColor = blue, yourColor = red;

The colors type uses the default internal values for the enumeration constants,
0, 1, …, although the constants could have been assigned any integer literal.


In C++, we could have the following:

enum colors { red, blue, green, yellow, black };


colors myColor = blue, yourColor = red;

enum colors { red, blue, green, yellow, black };
→ This defines an enumeration type called colors, which includes five possible values:
red, blue, green, yellow, and black. By default, each of these values is assigned an
integer starting from 0 (i.e., red = 0, blue = 1, green = 2, yellow = 3, and black = 4).
colors myColor = blue, yourColor = red;
→ This declares two variables of type colors: myColor and yourColor. The variable
myColor is assigned the value blue (which corresponds to the integer value 1), and
yourColor is assigned the value red (which corresponds to the integer value 0).
The colors type uses the default internal values for the enumeration constants, 0, 1,
…, although the constants could have been assigned any integer literal.
→ By default, the enumeration constants are assigned integer values starting from 0
and incrementing by 1 (i.e., red = 0, blue = 1, etc.).

26
Designs

 In 2004, an enumeration type was added to Java in Java 5.0. All


enumeration types in Java are implicitly subclasses of the predefined class
Enum
 Interestingly, none of the relatively recent scripting languages include
enumeration types.
– These included Perl, JavaScript, PHP, Python, Ruby, and Lua
– Even Java was a decade old before enumeration types were added


In 2004, an enumeration type was added to Java in Java 5.0.


•All enumeration types in Java are implicitly subclasses of the predefined class
Enum
→ In Java, starting with version 5.0, enumeration types were introduced. These types
automatically inherit from the Enum class, which is a built-in class in Java. This
inheritance provides certain built-in methods and functionality for working with
enumerations.
Interestingly, none of the relatively recent scripting languages include enumeration
types.
•These included Perl, JavaScript, PHP, Python, Ruby, and Lua
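→ Popular scripting languages such as Perl, JavaScript, PHP, Python, Ruby,
and Lua did not originally include enumeration types, despite being widely
used for scripting and development. Some have since added them: Python gained the
enum module in version 3.4, and PHP gained enumerations in version 8.1.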
•Even Java was a decade old before enumeration types were added
→ Java, a well-established programming language, did not have enumeration types
for the first ten years of its existence. The feature was added much later, in Java 5.0
(released in 2004).

27
Array Types

 An array is a homogeneous aggregate of data elements in which an


individual element is identified by its position in the aggregate, relative to
the first element.
 The individual data elements of an array are of the same type.
 References to individual array elements are specified using subscript
expressions.
 If any of the subscript expressions in a reference include variables, then the
reference will require an additional run-time calculation to determine the
address of the memory location being referenced.


Array Types
•An array is a homogeneous aggregate of data elements in which an individual
element is identified by its position in the aggregate, relative to the first element.
→ An array is a collection of elements that are all of the same type (homogeneous).
Each element in the array can be accessed by its position, or index, starting from the
first element.
•The individual data elements of an array are of the same type.
→ All elements in an array must be of the same data type, like integers, floats, or
strings. For example, an array of integers can only hold integer values.
•References to individual array elements are specified using subscript
expressions.
→ To access an element in the array, you use a subscript expression, which is
typically written using square brackets [] containing the index number. For example,
arr[0] refers to the first element in the array arr.
•If any of the subscript expressions in a reference include variables, then the
reference will require an additional run-time calculation to determine the address
of the memory location being referenced.
→ When the index used to access an array element is stored in a variable, the
program must compute the memory address of the element at that index during
runtime. This makes the access slightly slower, as it requires additional calculations to
find the correct memory location.
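The run-time calculation mentioned above can be written out as a small Python sketch. The function name element_address, the base address 1000, and the 4-byte element size are hypothetical values picked for illustration; the formula is the usual one for a one-dimensional array stored in adjacent cells.

```python
def element_address(base_address: int, index: int, element_size: int,
                    lower_bound: int = 0) -> int:
    """Address of element `index` in a one-dimensional array.

    address = base + (index - lower_bound) * element_size
    """
    return base_address + (index - lower_bound) * element_size


BASE = 1000    # hypothetical address of the first element
INT_SIZE = 4   # assumed size of one element in bytes

i = 3          # a variable subscript, so the address is only known at run time
print(element_address(BASE, i, INT_SIZE))
```

When the subscript is a constant, a compiler can fold this arithmetic in advance; when it is a variable such as i, the multiplication and addition have to happen while the program runs.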

28
Design Issues

 The primary design issues specific to arrays are the following:


– What types are legal for subscripts?
– Are subscripting expressions in element references range checked?
– When are subscript ranges bound?
– When does allocation take place?
– Are ragged or rectangular multidimensional arrays allowed, or both?
– Can arrays be initialized when they have their storage allocated?
– What kinds of slices are allowed, if any?


Design Issues
The primary design issues specific to arrays are the following:
•What types are legal for subscripts?
→ This question addresses what types of values can be used to
access elements in an array. For example, can you use an integer,
floating point, or string as the index, or must it be a specific type like
integers?
•Are subscripting expressions in element references range
checked?
→ This refers to whether the program checks if the index used to
access an array element is within valid bounds (e.g., checking if the
index is not less than 0 or greater than the array length). If range
checking is enabled, the program will throw an error if an invalid
index is accessed.
•When are subscript ranges bound?
→ This addresses when the size or range of indices that can be used
for accessing array elements is determined. It could be done at
compile-time (before the program runs) or at runtime (while the
program is running), affecting how flexible or efficient the array is.
•When does allocation take place?
→ This refers to when the memory for the array is actually allocated.
It could be allocated statically, before the program runs, or dynamically
during execution. This affects how arrays are managed
and how memory is used.
•Are ragged or rectangular multidimensional arrays
allowed, or both?
→ This addresses whether multidimensional arrays (like
matrices) can have rows with varying lengths (ragged
arrays) or must all have the same length in each
dimension (rectangular arrays). Some languages support
only rectangular arrays, while others allow ragged arrays
as well.
•Can arrays be initialized when they have their storage
allocated?
→ This question asks if you can assign initial values to an
array when it is created, or if the array will be empty
until later assignments. Some languages allow
initialization at the time of allocation, while others may
not.
•What kinds of slices are allowed, if any?
→ A "slice" refers to a portion of an array (such as a
subarray or a subset of elements). This question
concerns whether languages support extracting and
working with portions of arrays, and how flexible or
complex these slices can be (e.g., specifying ranges of
indices, skipping elements, etc.).
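Several of these design issues can be observed directly in one language. The sketch below shows how Python lists answer three of them (range checking, ragged arrays with initialization at creation, and slices); it is just one set of design choices, and other languages answer the same questions differently.

```python
values = [10, 20, 30, 40, 50]

# Range checking: an out-of-range subscript raises IndexError at run time.
try:
    values[5]
    checked = False
except IndexError:
    checked = True

# Initialization at creation time, here with a ragged two-dimensional list
# whose rows have different lengths.
ragged = [[1], [2, 3], [4, 5, 6]]
row_lengths = [len(row) for row in ragged]

# Slices: a contiguous range of indices, and a range that skips elements.
middle = values[1:4]
every_other = values[::2]

print(checked, row_lengths, middle, every_other)
```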

29
Arrays and Indices

 Indexing (or subscripting) is a mapping from indices to elements.


 The mapping can be shown as:

array_name (index_value_list) → an element

 Ex, Ada assignment statement:

Sum := Sum + B(I);

Because ( ) are used for both subprogram parameters and array subscripts in
Ada, this results in reduced readability.


Arrays and Indices


•Indexing (or subscripting) is a mapping from indices to
elements:
→ Indexing refers to the process of accessing an element
from an array by specifying its position (index) in the
array. The index is used to "map" or retrieve the correct
element.
•The mapping can be shown as:
→ array_name (index_value_list) → an element
This means that given an array name and an index (or list
of indices), you can access the corresponding element.
For example, in an array B with an index I, B(I) retrieves
the element at position I.
•Ex, Ada assignment statement:

→ Sum := Sum + B(I);
This is an example of indexing in Ada. The value of the
array element B(I) at index I is added to Sum, and the
result is stored back in Sum.
•Because ( ) are used for both subprogram parameters
and array subscripts in Ada, this results in reduced
readability:
→ In Ada, parentheses are used for both
function/subprogram parameters and array subscripts,
which can sometimes confuse readers, as it is unclear
whether the parentheses are referring to an array index
or a function call. This can affect the readability of code,
especially when it is not immediately obvious whether
something is an array access or a function call.

30
Arrays and Indices

 C-based languages use [ ] to delimit array indices.


 Two distinct types are involved in an array type:
– The element type, and
– The type of the subscripts.
 The type of the subscript is often a sub-range of integers.
 Among contemporary languages, C, C++, Perl, and Fortran don’t specify
range checking of subscripts, but Java, ML, and C# do.
 In Perl, all array names begin with an at sign (@). Because array elements are always
scalars and the names of scalars always begin with dollar signs ($),
references to array elements use dollar signs rather than at signs in their
names. For example, for the array @list, the second element is referenced with
$list[1].


C-based languages use [ ] to delimit array indices:


•In languages like C, C++, Perl, and others, square brackets [ ] are used to access or reference
elements within an array. For example, array[2] accesses the third element in the array.
Two distinct types are involved in an array type:
•The element type refers to the type of data that the array stores (e.g., integer, float, string).
•The type of the subscripts refers to the type of value used to access elements in the array,
typically an integer or a sub-range of integers (like 0 to 10).
The type of the subscript is often a sub-range of integers:
•The index or subscript used to reference elements in an array usually falls within a defined
range, like integers from 0 to the array length minus 1.
Among contemporary languages, C, C++, Perl, and Fortran don’t specify range checking of
subscripts, but Java, ML, and C# do:
•In languages like C, C++, and Fortran, if you try to access an array element using an
invalid index (out of range), it won't automatically give an error. It might access unrelated
memory locations, which could cause bugs or crashes. Perl also reports no error for an
out-of-range read; it returns the undefined value instead. On the other hand, languages like
Java, ML, and C# check the validity of array indices at runtime and will throw an error if you
try to access an out-of-range element.
In Perl, all array names begin with an at sign (@). Because array elements are always scalars
and the names of scalars always begin with dollar signs ($), references to array elements use
dollar signs rather than at signs in their names:
•In Perl, arrays are denoted with an at sign @, while individual elements are referenced with
a dollar sign $. This is because arrays in Perl store scalar values, and scalars (individual
elements) are prefixed with a dollar sign.
•Example:
• @list is the array,
• $list[1] refers to the second element in the array list.
This naming convention helps distinguish between the array itself (@) and individual
elements within the array ($).

31
Array Initialization

 Some languages allow initialization at the time of storage allocation.


 Usually just a list of values that are put in the array in the order in which
the array elements are stored in memory.
 C, C++, Java, and C# allow initialization of their arrays. Consider the
following C declaration:

int list [] = {4, 5, 7, 83};


The compiler sets the length of the array.

 Character Strings in C & C++ are implemented as arrays of char. These


arrays can be initialized to string constants, as in

char name [] = "Quellos"; //how many elements in array name?

The array will have 8 elements because all strings are terminated with a null
character (zero), which is implicitly supplied by the system for string constants.


Array Initialization (Simple Explanation):


•Initialization at the time of storage allocation: Some programming languages allow you to
set the values of an array when it is created, instead of assigning values later.
•List of values: The array elements are initialized with a list of values, and these values are
stored in memory in the order you provide.

•Example in C:
int list[] = {4, 5, 7, 83};
•Here, the array list is initialized with values 4, 5, 7, and 83.
•The compiler automatically sets the array length based on the number of values provided.
In this case, list will have 4 elements.
Character Strings in C & C++:
•In C and C++, strings are stored as arrays of char.
•For example:
char name[] = "Quellos";

• This initializes the name array with the string "Quellos".


• The array will have 8 elements because strings in C are automatically terminated
with a special null character (\0 or zero) at the end, marking the end of the
string. So, the string "Quellos" has 7 characters plus 1 for the null character,
making it 8 elements in total.

In summary, array initialization allows you to set values when you create the array, and in
some cases, like strings, the system adds special characters like the null character to mark the
end.
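The count of 8 elements can be checked without a C compiler. The sketch below does not run the C declaration itself; it assumes Python's standard ctypes module, whose create_string_buffer function allocates a C char array and appends the terminating null character when given a bytes value.

```python
import ctypes

# Allocate a C char array initialized from a 7-character string constant.
name = ctypes.create_string_buffer(b"Quellos")

visible_chars = len(b"Quellos")     # characters written in the literal
storage_cells = ctypes.sizeof(name) # cells in the array, terminator included
last_cell = name.raw[-1]            # the final cell holds the null character

print(visible_chars, storage_cells, last_cell)
```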

32
Array Initialization

 Arrays of strings in C and C++ can also be initialized with string literals. For
example,

char *names [] = {"Quellos", "Carlos", "Jr"};

 In Java, similar syntax is used to define and initialize an array of references


to String objects. For example,

String [] names = {"Quellos", "Carlos", "Jr"};


Arrays of Strings in C, C++, and Java (Simple Explanation):


•C and C++:
• In C and C++, you can initialize an array of strings by
using an array of pointers to characters (each string is
a pointer to the first character in the string).

Example
char *names[] = {"Quellos", "Carlos", "Jr"};
Here, names is an array of pointers to strings (character arrays).
Each element in the array points to a string literal (like
"Quellos", "Carlos", and "Jr").
This means names[0] points to "Quellos", names[1] points to
"Carlos", and names[2] points to "Jr".

Java:
•In Java, you can use a similar syntax to initialize an array of
String objects.
•Example:
String[] names = {"Quellos", "Carlos", "Jr"};
In this case, names is an array of references to String objects.
Each element in the array holds a reference to a String object,
and the array is initialized with string literals ("Quellos",
"Carlos", and "Jr").

33
Array Operations

 The most common array operations are assignment, catenation,


comparison for equality and inequality, and slices.
 The C-based languages do not provide any array operations, except
through methods of Java, C++, and C#.
 Perl supports array assignments but does not support comparisons.
 Python’s arrays are called lists, although they have all the characteristics of
dynamic arrays. Because the objects can be of any types, these arrays are
heterogeneous. Python supports array assignments, but they are only reference
changes. Python also supports array catenation and element membership
operations
 Ruby also provides array catenation
 APL provides the most powerful array processing operations for vectors
and matrices as well as unary operators (for example, to reverse column
elements)


Explanation of Each Sentence:


1."The most common array operations are assignment, catenation,
comparison for equality and inequality, and slices."
1. This sentence introduces the basic operations that can be
performed on arrays:
1. Assignment: Assigning an entire array to another array variable in one operation.
2. Catenation: Joining two arrays together (also known as
concatenation).
3. Comparison for equality and inequality: Checking if two
arrays are equal or not.
4. Slices: Creating subarrays or extracting parts of an array.
2."The C-based languages do not provide any array operations, except
through methods of Java, C++, and C#."
1. This means that languages like C, C++, and Java do not have built-in
operations for arrays (like concatenation or slicing). However, these
operations can be performed using methods or functions in Java,
C++, and C#.
3."Perl supports array assignments but does not support comparisons."
1. Perl allows you to assign values to arrays, but it doesn't provide a
built-in way to directly compare arrays for equality or inequality.
For example, you can't compare two arrays with a simple operator
like ==.
4."Python’s arrays are called lists, although they have all the characteristics
of dynamic arrays."
1. In Python, the equivalent of arrays is called lists. These lists are
dynamic, meaning their size can change
(elements can be added or removed), similar to
dynamic arrays in other languages.
5. "Because the objects can be of any types, these
arrays are heterogeneous."
1. Python's lists can contain elements of different
types (e.g., integers, strings, or other objects),
which makes them heterogeneous (as opposed
to arrays in other languages, which are typically
homogeneous, meaning all elements are of the
same type).
6."Python supports array assignments, but they are only
reference changes."
1. When you assign one list to another in Python,
you're not copying the elements of the list.
Instead, both variables point to the same list in
memory. This is called reference assignment,
meaning the list is not duplicated, but rather the
reference to the list is assigned to the new
variable.
7."Python also supports array catenation and element
membership operations."
1. In Python, you can concatenate (combine) lists
using the + operator, and you can also check if
an element is in a list using the in keyword
(membership test).
8."Ruby also provides array catenation."
1. Ruby also supports array concatenation,
allowing you to join arrays together using
methods or operators (like +).
9. "APL provides the most powerful array processing
operations for vectors and matrices as well as unary
operators (for example, to reverse column elements)."
1. APL (A Programming Language) is a language
known for its powerful array manipulation
capabilities. It supports operations for handling
vectors (one-dimensional arrays) and matrices
(two-dimensional arrays), and it includes special
operators (like unary operators) that allow for
advanced operations, such as reversing columns
in a matrix.
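The Python behavior described in points 4 to 7 above can be tried directly. This is a minimal sketch; the variable names are arbitrary.

```python
a = [1, 2, 3]
b = a              # assignment is a reference change: b names the same list
b.append(4)        # lists are dynamic; the change is visible through a as well

c = list(a)        # an explicit copy creates a separate list
c.append(5)        # does not affect a

joined = a + [10, 20]     # catenation with the + operator
has_two = 2 in a          # element membership with the in keyword
mixed = [1, "two", 3.0]   # heterogeneous elements in one list

print(a, b is a, c, joined, has_two, mixed)
```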

34
Associative Arrays

 An associative array is an unordered collection of data elements that are


indexed by an equal number of values called keys.
 So each element of an associative array is in fact a pair of entities, a key
and a value.
 Associative arrays are supported by the standard class libraries of Java,
C++, C#, and F#.
 Example: In Perl, associative arrays are often called hashes. Names begin
with %; literals are delimited by parentheses. Hashes can be set to literal
values with assignment statement, as in

%salaries = ("Gary" => 75000, "Perry" => 57000, "Mary" => 55750, "Cedric" =>
47850);


"An associative array is an unordered collection of


data elements that are indexed by an equal number
of values called keys."
•An associative array is a data structure that stores
values, but instead of using numerical indices like
traditional arrays, it uses keys (which are unique
identifiers) to access the values. The key-value pair
structure means that each element in the array has a
key that maps to its corresponding value. Unlike
traditional arrays, the elements in an associative array
don't follow any specific order.
"So each element of an associative array is in fact a
pair of entities, a key and a value."
•In an associative array, every element consists of two
parts: a key and its associated value. This key-value
pair allows data to be retrieved or modified by
referencing the key rather than using an index.

"Associative arrays are supported by the standard
class libraries of Java, C++, C#, and F#."
•Modern programming languages like Java, C++,
C#, and F# provide built-in support for associative
arrays, typically through their standard libraries.
These languages have specialized data structures
(such as HashMaps or Dictionaries) that implement
associative arrays.
"Example: In Perl, associative arrays are often
called hashes. Names begin with %; literals are
delimited by parentheses."
•In Perl, associative arrays are referred to as
hashes. A hash is identified by a percent sign (%) at
the beginning of its name. When writing hash
literals (literal key-value pairs), they are enclosed in
parentheses.

"Hashes can be set to literal values with


assignment statement, as in %salaries = ("Gary"
=> 75000, "Perry" => 57000, "Mary" => 55750,
"Cedric" => 47850);"
•Here is an example of how you can assign values
to an associative array (hash) in Perl. The =>
operator pairs each key (a name) with its
respective value (a salary). In this example, the
%salaries hash contains key-value pairs where the
keys are the names ("Gary", "Perry", etc.) and the
values are their salaries (75000, 57000, etc.).

35
Associative Arrays

 - Subscripting is done using braces and keys. So an assignment of 58850 to


the element of %salaries with the key “Perry” would appear as follows:

$salaries{"Perry"} = 58850;

 - Elements can be removed with delete operator, as in the following:

delete $salaries{"Gary"};

 - The entire hash can be emptied by assigning the empty literal to it, as in the
following:

%salaries = ();


"Subscripting is done using braces and keys. So an assignment


of 58850 to the element of %salaries with the key “Perry”
would appear as follows: $salaries{"Perry"} = 58850;"
•In Perl, subscripting (or referencing array elements) is done
using curly braces {} and the key. In this example, the key is
"Perry", and the value being assigned to that key is 58850. This
assignment updates the %salaries hash (associative array),
setting the salary of "Perry" to 58850.
"Elements can be removed with delete operator, as in the
following: delete $salaries{"Gary"};"
•To remove an element from an associative array in Perl, the
delete operator is used. The key "Gary" is specified in curly
braces {} to remove that specific key-value pair from the
%salaries hash.
"Elements can be emptied by assigning the empty literal, as in
the following: %salaries = ();"
•To empty a hash in Perl, you can assign it the empty list (). This removes every
key-value pair from the %salaries hash. The same assignment empties an ordinary
array (e.g., @salaries = ();), but @salaries and %salaries are separate variables,
so clearing one does not affect the other.
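For comparison, the same three operations (assigning to a key, deleting an element, and emptying the whole structure) can be sketched with a Python dictionary. The names and salaries are the ones from the Perl example; the helper variables perry_salary and remaining are introduced here only to record intermediate results.

```python
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

salaries["Perry"] = 58850         # subscripting with a key, then assignment
perry_salary = salaries["Perry"]

del salaries["Gary"]              # remove one key-value pair
remaining = sorted(salaries)      # keys left after the deletion

salaries.clear()                  # empty the whole dictionary

print(perry_salary, remaining, salaries)
```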

36
Associative Arrays

 Python’s associative arrays, which are called dictionaries, are similar


to those of Perl, except the values are all references to objects.

 PHP’s arrays are both normal arrays and associative arrays.

 A Lua table is an associative array in which both the keys and the values
can be any type.

 C# and F# support associative arrays through a .NET class.


"Python’s associative arrays, which are called dictionaries, are similar to those of
Perl, except the values are all references to objects."
•In Python, associative arrays are called dictionaries. They are similar to Perl's
hashes, as both store key-value pairs. However, in Python, all values in the dictionary
are references to objects. This means that when you assign a value to a key, the value
refers to an object, and changes to the object will be reflected wherever that
reference is used.
"PHP’s arrays are both normal arrays and associative array."
•In PHP, arrays are flexible and can act as both normal arrays (indexed by integers)
and associative arrays (indexed by strings). This means PHP
arrays can behave in different ways depending on how they are used, making them
very versatile.
"A Lua table is an associative array in which both the keys and the values can be
any type."
•In Lua, the primary data structure for associative arrays is called a table. Unlike in
some other languages, in Lua, both keys and values in a table can be of almost any type
(e.g., strings, numbers, functions, or even other tables); the one exception is that nil
cannot be used as a key. This provides great flexibility in how tables can be used.
"C# and F# support associative arrays through a .NET class."
•Both C# and F#, which are part of the .NET framework, support associative arrays
through the use of a predefined class in .NET. In C#, for example, the
Dictionary<TKey, TValue> class is used to store key-value pairs, where the keys and
values can be of any type.

37
Record Types

 A record is a possibly heterogeneous aggregate of data elements in


which the individual elements are identified by names.

 In C, C++, and C#, records are supported with the struct data type. In
C++, structures are a minor variation on classes.


"A record is a possibly heterogeneous aggregate of data elements in which the


individual elements are identified by names."
•A record is a data structure that can hold multiple data elements (also called fields
or members), and each element can be of a different type (this makes it
heterogeneous). Unlike arrays (which use positions to identify elements), a record
identifies each element by a name (also known as a field or member name). For
example, a record for a person might have fields like name, age, and address, each
holding different types of data.
"In C, C++, and C#, records are supported with the struct data type."
•In C, C++, and C#, a record is implemented using the struct (short for structure) data
type. The struct allows grouping different data types under one name. For example,
in C, you could define a struct to hold information about a person, such as their
name, age, and address.
"In C++, structures are a minor variation on classes."
•In C++, structures (struct) are nearly identical to classes. The key difference is the
default access level: members of a struct are public by default (accessible directly), while
members of a class are private by default. Both structs and classes in C++ can have data
members and member functions, so a struct is not a restricted or more lightweight form of
class; by convention, it is often used for simple aggregates of data.

38
Definitions of Records

 The fundamental difference between a record and an array is that


record elements, or fields, are not referenced by indices. Instead, the
fields are named with identifiers, and references to the fields are made
using these identifiers.

int arr[3] = {10, 20, 30};


printf("%d", arr[1]); // Accessing the second element using index

struct Person {
char name[50];
int age;
};

struct Person person1 = {"Alice", 30};


printf("%s", person1.name); // Accessing the 'name' field by its name


The statement highlights the fundamental difference between a record and an


array:
•Arrays: In arrays, elements are identified by their position (or index) in the array. The
elements are referenced using numeric indices. For example, in a one-dimensional
array arr, the reference arr[0] uses the index 0 to specify the first element of the array.
•Records: In records, elements (or fields) are identified by names (or identifiers)
rather than indices. Each field within a record is given a unique name, and references
to the fields are made using these names. For example, in a record representing a
"Person," you might have fields like name, age, and address, and each field is
accessed by its name, such as person.name or person.age.
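The same contrast can be sketched in Python, with a list standing in for the array and a dataclass standing in for the record. Using a dataclass is a choice made for this illustration; the Person fields mirror the C struct shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """A record: fields may differ in type and are referenced by name."""
    name: str
    age: int


arr = [10, 20, 30]             # array-like: elements referenced by index
person1 = Person("Alice", 30)  # record: fields referenced by identifier

second_element = arr[1]        # position 1 selects the second element
person_name = person1.name     # the identifier 'name' selects the field

print(second_element, person_name, person1.age)
```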

39
Tuple Types

 A tuple is a data type that is similar to a record, except that the elements are not named
 Python
– Closely related to its lists, but tuples are immutable
– If a tuple needs to be changed, it can be converted to a list with the list function
– Create with a tuple literal
myTuple = (3, 5.8, 'apple')
– Note that the elements of a tuple need not be of the same type
– The elements of a tuple can be referenced with indexing in brackets, as in the
following:
myTuple[1]
This references the second element of the tuple, because tuple indexing begins at 0
– Tuples can be catenated with the plus (+) operator
newTuple = (1, 2) + (3, 4) # Result: (1, 2, 3, 4)
– They can be deleted with the del statement


Tuple as a Data Type:


•A tuple is similar to a record in that it groups multiple elements
together. However, unlike records, the elements in a tuple are not
named (i.e., they don’t have identifiers like fields in a record).
Instead, they are ordered and accessed by their position in the
tuple.
Tuples in Python:
•In Python, tuples are closely related to lists, but with one key
difference: tuples are immutable, meaning their elements cannot
be modified after creation.
•If you need to change the contents of a tuple, you can convert it
into a list using the list() function, which allows for modifications.
Creating a Tuple:
•A tuple can be created using tuple literals. For example: myTuple = (3, 5.8, 'apple')
•This creates a tuple with three elements: an integer 3, a floating-point number 5.8, and a string 'apple'.

40
Mixed Types in Tuples:
•Unlike arrays, the elements of a tuple need not be of the same
type. For example, a tuple can contain integers, strings, floats,
etc., all together in one tuple.
Accessing Elements in a Tuple:
•The elements of a tuple can be accessed using indexing in
brackets. However, tuple indexing in Python starts from 0, not
1. So: myTuple[0] # This references the first element, which is 3
in this case.
Tuple Operations:
•Concatenation: Tuples can be concatenated (combined) using
the + operator. For example: newTuple = (1, 2) + (3, 4) # Result:
(1, 2, 3, 4)
•Deletion: You can remove a tuple using the del statement: del
myTuple
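The tuple operations described above can be tried together in one short Python sketch. The variable names (my_tuple, as_list, changed) are chosen here for illustration only.

```python
my_tuple = (3, 5.8, "apple")      # elements may have different types

first = my_tuple[0]               # indexing starts at 0

# Tuples are immutable: item assignment raises TypeError.
try:
    my_tuple[0] = 99
    mutated = True
except TypeError:
    mutated = False

# To "change" a tuple, convert it to a list, modify the list,
# and build a new tuple from it.
as_list = list(my_tuple)
as_list[0] = 99
changed = tuple(as_list)

new_tuple = (1, 2) + (3, 4)       # concatenation builds a new tuple

del my_tuple                      # removes the name my_tuple
```

Note that del removes the variable itself, not an element; elements of a tuple cannot be deleted individually.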

40
Tuple Types

 ML
– Create with a tuple

val myTuple = (3, 5.8, "apple");

– Access as follows:

#1(myTuple);
This references the first element

– A new tuple type can be defined in ML with a type declaration, such as the following

type intReal = int * real;

Data Types

Creating a Tuple in ML:


•In ML (a functional programming language), a tuple can be created similarly to other
languages, but with syntax specific to ML.
•Here's an example of creating a tuple in ML: val myTuple = (3, 5.8, "apple");
•This creates a tuple myTuple that contains an integer 3, a real (floating-point) number 5.8, and a string "apple" (ML string literals use double quotes).

Accessing Elements in ML:


•To access an element of a tuple in ML, you can use the # operator, followed by the
index of the element you want to access.
•ML uses 1-based indexing, meaning that #1 will reference the first element of the
tuple.
#1(myTuple); (* This references the first element, which is 3 *)

Defining a New Tuple Type:


•In ML, you can define a new tuple type using a type declaration.
•Here's how you might declare a new tuple type intReal that consists of an integer
and a real (floating-point) number:
type intReal = int * real;
This intReal type can now be used to represent tuples where the first element is an
integer and the second element is a real number.

41
Tuple Types

 F#

let tup = (3, 5, 7);;


let a, b, c = tup;;

This assigns 3 to a, 5 to b, and 7 to c

 Tuples are used in Python, ML, and F# to allow functions to return multiple values

Data Types

Creating a Tuple in F#:


•In F#, a tuple is created by placing values inside parentheses, separated by commas.
•For example: let tup = (3, 5, 7);;
This creates a tuple tup with the values 3, 5, and 7.

Destructuring a Tuple:
•F# allows you to destructure a tuple directly into individual variables. This means
you can assign each element of the tuple to a separate variable.
•The following code does this: let a, b, c = tup;;
Here, the values of the tuple tup are assigned to the variables a, b, and c in order.
Specifically:
•3 is assigned to a
•5 is assigned to b
•7 is assigned to c

Tuples in Functions:
•In F#, as in Python and ML, tuples are often used to return multiple values from a
function. This allows functions to return more than one value without the need for
complex data structures.
•Example of a function returning a tuple: let addAndMultiply x y = (x + y, x * y)
•This function takes two numbers, adds them, and multiplies them, returning both
results as a tuple.
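For comparison, here is a minimal Python sketch of the same idea: a function that returns two results as one tuple, which the caller unpacks. The name add_and_multiply simply mirrors the F# example.

```python
def add_and_multiply(x, y):
    """Return the sum and the product together as one tuple."""
    return x + y, x * y

# The returned tuple is unpacked into two variables, in order.
total, product = add_and_multiply(3, 5)
```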

42
List Types

 Lists were first supported in the first functional programming language, Lisp.
 Lists in Common Lisp and Scheme are delimited by parentheses and
the elements are not separated by any punctuation (no commas). For
example

(A B C D)
Nested lists have the same form, so we could have

(A (B C) D)
In this list, (B C) is a list nested inside the outer list

 Data and code have the same form


– As data, (A B C) is literally what it is
– As code, (A B C) is the function A applied to the parameters B and
C

Data Types

1. "Lists were first supported in the first functional programming language."


•The earliest functional programming languages introduced lists as a built-in data
structure.
•Lists became one of the fundamental ways to store and manipulate data in these
languages.
2. "Lists in Common Lisp and Scheme are delimited by parentheses and the
elements are not separated by any punctuation (no commas)."
•In Common Lisp and Scheme (both functional languages), lists are enclosed in
parentheses ().
•Unlike many other programming languages, which use commas (,) to separate
elements in a list, Lisp-style lists do not use commas.
3. "For example (A B C D)"
•This is an example of a list in Lisp-like syntax.
•It contains four elements: A, B, C, and D.
4. "Nested lists have the same form, so we could have (A (B C) D)"
•Lists can contain other lists inside them.
•In this example:
• A and D are individual elements.
• (B C) is a sub-list inside the main list.
5. "In this list, (B C) is a list nested inside the outer list."
•The part (B C) is not treated as separate elements B and C but rather as a single list
inside the main list.
6. "Data and code have the same form."
•In Lisp-like languages, lists can be interpreted in two ways:
• As Data – A list is just a collection of values.
• As Code – A list can also be a function call, where the first element is a
function and the rest are its arguments.

43
7. "As data, (A B C) is literally what it is."
•If treated as data, (A B C) is simply a list with three elements: A,
B, and C.
•It does not execute anything—it just stores the values.
8. "As code, (A B C) is the function A applied to the parameters
B and C."
•If the system treats it as code, then:
• A is assumed to be a function.
• B and C are arguments passed to the function A.
•This means the system will execute A(B, C), just like calling a
function in other programming languages.

43
List Types

 The interpreter needs to know whether a list is code or data, so if it is data, we quote it with an apostrophe

 List operations in Scheme


– CAR returns the first element of its list parameter

– CDR returns the remainder of its list parameter after the first element
has been removed

Data Types

"The interpreter needs to know which a list is, so if it is data, we quote it with an
apostrophe."
•In Scheme and Lisp, lists can be treated as code (function calls) or data (just a
collection of values).
•To tell the interpreter that a list is data and not code, we use an apostrophe (')
before it
EX: '(A B C)
This tells the interpreter: "Do not treat A as a function; this is just a list of values."
Without the apostrophe, the system might try to execute A as a function.
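The code-versus-data idea can be imitated in Python with a toy evaluator. This is only a sketch under stated assumptions: Python lists stand in for Lisp lists, the string "quote" stands in for the apostrophe, and the FUNCTIONS table is a hypothetical set of known function names.

```python
import operator

# Hypothetical table of functions that a list's first element may name.
FUNCTIONS = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    """Treat a Python list as code: the first element names the function."""
    if not isinstance(expr, list):
        return expr                      # a plain value evaluates to itself
    if expr and expr[0] == "quote":
        return expr[1]                   # quoted: hand back the list as data
    func = FUNCTIONS[expr[0]]
    args = [evaluate(arg) for arg in expr[1:]]
    return func(*args)

as_code = evaluate(["+", 2, 3])              # "+" applied to 2 and 3
as_data = evaluate(["quote", ["+", 2, 3]])   # the list is left untouched
```

The same list ["+", 2, 3] produces a number when evaluated as code and is returned unchanged when quoted.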

"CAR returns the first element of its list parameter."


•The CAR function extracts the first element from a list.
•Example: (car '(A B C)) ; Returns A
Given the list '(A B C), CAR returns A.

"CDR returns the remainder of its list parameter after the first element has been
removed."
•The CDR function removes the first element and returns everything else.
•Example: (cdr '(A B C)) ; Returns (B C)
The first element A is removed, leaving (B C).

44
List Types

– CONS puts its first parameter into its second parameter, a list, to make
a new list

– LIST returns a new list of its parameters

Data Types

"CONS puts its first parameter into its second parameter, a list, to make a new list."
•The CONS function takes two arguments:
• A single element (first parameter).
• A list (second parameter).
•It adds the element to the front of the list to form a new list.
•Example: (cons 'A '(B C)) ; Returns (A B C)
A is added to the front of (B C), creating (A B C).

If the second parameter is not a list, CONS creates a pair (dotted pair):
EX: (cons 'A 'B) ; Returns (A . B)
This is not a regular list but a pair (used in association lists).

"LIST returns a new list of its parameters."


•The LIST function takes multiple parameters and creates a list containing all of them.
EX: (list 'A 'B 'C) ; Returns (A B C)
Unlike CONS, LIST does not require an existing list—it simply creates one.
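The four Scheme operations can be approximated in Python as follows. This is a rough sketch: Python lists are used in place of Scheme lists, the function make_list is a name chosen here for LIST, and dotted pairs are not modelled.

```python
def car(lst):
    """Return the first element of the list."""
    return lst[0]

def cdr(lst):
    """Return a new list with everything after the first element."""
    return lst[1:]

def cons(item, lst):
    """Return a new list with item placed in front of lst."""
    return [item] + lst

def make_list(*items):
    """Return a new list built from the given parameters."""
    return list(items)
```

None of these functions modifies its argument; each one returns a new value, as the Scheme versions do.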

45
List Types

 List operations in ML
– Lists are written in brackets and the elements are separated by
commas, as in the following list of integers:

[5, 7, 9]
– List elements must be of the same type, so the following list would
be illegal:

[5, 7.3, 9] illegal

– The Scheme CONS function is implemented as a binary infix operator in ML, represented as ::. For example,

3 :: [5, 7, 9] returns the new list [3, 5, 7, 9]

Data Types

"Lists are written in brackets and the elements are separated by commas, as in the
following list of integers: [5, 7, 9]"
•In ML (MetaLanguage), lists use square brackets [ ] to enclose elements.
•Elements are separated by commas.
•Example of a valid list of integers: [5, 7, 9] (* A list containing 5, 7, and 9 *)

"List elements must be of the same type, so the following list would be illegal: [5,
7.3, 9]"
•ML lists are homogeneous, meaning all elements must be of the same type.
•The example [5, 7.3, 9] is invalid because:
• 5 and 9 are integers (int).
• 7.3 is a floating-point number (real).
• ML does not allow mixing integers and floats in the same list.
•If you need a list of floating-point numbers, all elements must be floats: [5.0, 7.3,
9.0] (* This is a valid list of real numbers *)

"The Scheme CONS function is implemented as a binary infix operator in ML,


represented as ::. For example, 3 :: [5, 7, 9] returns new list [3, 5, 7, 9]"
•:: is the ML equivalent of Scheme's CONS function.
•It adds an element to the front of a list, creating a new list.
•Example: 3 :: [5, 7, 9] (* Returns [3, 5, 7, 9] *)
3 is inserted at the beginning of [5, 7, 9].
The result is a new list [3, 5, 7, 9].

46
List Types

 ML has functions that correspond to Scheme’s CAR and CDR; they are named hd (head) and tl (tail), respectively

hd [5, 7, 9] is 5
tl [5, 7, 9] is [7, 9]

Data Types

"ML has functions that correspond to Scheme’s CAR and CDR functions are named
hd (head) and tl (tail), respectively."
•In Scheme,
• CAR returns the first element of a list.
• CDR returns everything except the first element.
•In ML,
• The equivalent of CAR is hd (head).
• The equivalent of CDR is tl (tail).

"hd [5, 7, 9] is 5"


•hd (head) returns the first element of a list.
•Example: hd [5, 7, 9]; (* Output: 5 *)
The first element of [5, 7, 9] is 5.

"tl [5, 7, 9] is [7, 9]"


•tl (tail) returns everything except the first element.
•Example: tl [5, 7, 9]; (* Output: [7, 9] *)
Removes 5 and returns [7, 9].
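A Python sketch of the same ML list behaviour, for comparison. The helper is_homogeneous is a hypothetical function written here to imitate ML's same-type rule, which Python itself does not enforce.

```python
numbers = [5, 7, 9]

head, *tail = numbers      # like hd and tl: head is 5, tail is [7, 9]
extended = [3, *numbers]   # like 3 :: [5, 7, 9]; numbers itself is unchanged

def is_homogeneous(items):
    """True if every element has exactly the same type, as ML requires."""
    return len({type(item) for item in items}) <= 1
```

With this helper, [5, 7, 9] passes the check while [5, 7.3, 9] does not, which matches the legal and illegal ML lists shown earlier.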

47
List Types

 F# Lists
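– Like those of ML, except elements are separated by semicolons and the head and tail operations are functions of the List module (List.head and List.tail in current F#; older versions named them List.hd and List.tl)

List.head [1; 3; 5; 7] returns 1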

Data Types

"F# Lists – Like those of ML, except elements are separated by semicolons and hd
and tl are methods of the List class."
•F# lists are similar to ML lists, but instead of commas, they use semicolons (;) to
separate elements.
•In ML, lists use brackets [ ] and elements are separated by commas (5, 7, 9).
•In F#, lists also use brackets [ ], but elements are separated by semicolons (1; 3; 5;
7).

Example (F# List)


let myList = [1; 3; 5; 7] // Correct F# syntax
In F#, the head operation is a function of the List module.
List.head returns the first element of the list, so List.head myList is 1.

(Note: current F# names this function List.head; List.hd is the older name used in earlier versions.)

48
List Types

 Python Lists
– The list data type also serves as Python’s arrays
– Unlike Scheme, Common Lisp, ML, and F#, Python’s lists are mutable
– Elements can be of any type
– Create a list with an assignment

myList = [3, 5.8, "grape"]

– List elements are referenced with subscripting, with indices beginning at zero

x = myList[1]   # Assigns 5.8 to x

– List elements can be deleted with del


del myList[1]

Data Types

"The list data type also serves as Python’s arrays."

•In Python, lists are used like arrays in other programming languages.
•Python has no separate built-in array type (unlike C or Java), although the standard library provides an array module.

"Unlike Scheme, Common Lisp, ML, and F#, Python’s lists are mutable."
•Mutable means that Python lists can be changed (elements can be added, removed, or modified).
•In contrast, lists in ML and F# are immutable (they cannot be changed after creation). Lists in Scheme and Common Lisp are normally used in a functional style, although both languages do provide operations that modify a list in place (for example, set-car! in Scheme).

myList = [1, 2, 3]
myList[1] = 99 # Changes second element to 99
print(myList) # Output: [1, 99, 3]

"Elements can be of any type."


•Python allows lists to contain different types of data (numbers, strings, objects,
etc.).
•This is different from ML and F#, which require all elements to be of the same
type.

"List elements can be deleted with del."


•You can remove a specific element using the del keyword.
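A short Python sketch that puts the list operations above together; the extra element "kiwi" is made up for illustration.

```python
my_list = [3, 5.8, "grape"]   # elements can be of any type

x = my_list[1]                # subscripting starts at zero, so x is 5.8

my_list[0] = 4                # lists are mutable: replace an element in place
my_list.append("kiwi")        # add an element at the end
del my_list[1]                # delete the element at index 1 (5.8)
```

After these statements the list contains 4, "grape", and "kiwi", in that order.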

49
Next meeting..

50
Union Types

 A union is a type whose variables are allowed to store values of different types at different times during execution.

Data Types

1. "A union is a type whose variables are allowed to store different type values at
different times during execution."
•A union is a special type of variable that can hold different types of values, but only
one value at a time.
•It shares the same memory location for different data types.
•Used in C, C++, and other low-level languages for memory-efficient programming.

51
Unions in F#

 A union is declared in F# with a type statement using OR operators ( | )

- intReal is the new type


- IntValue and RealValue are constructors

Data Types

Union Types in F#
In F#, a union type allows a variable to hold different types of values at different
times.

Syntax
A union is declared using the type keyword and the OR operator (|), which separates the possible cases:

type intReal =
| IntValue of int
| RealValue of float;;

This defines a new union type called intReal.

The intReal type can hold either:
•An integer value (tagged as IntValue).
•A floating-point value (tagged as RealValue).

52
Unions in F#

 Values of type intReal can be created using the constructors as if they were functions, as in the following examples:

let ir1 = IntValue 17;;


let ir2 = RealValue 3.4;;

Data Types

IntValue 17 creates an intReal value storing the integer 17.
RealValue 3.4 creates an intReal value storing the float 3.4.

53
Unions in F#

 Accessing the value of a union is done with a pattern-matching structure

– A pattern can be of any data type
– The expression list can have wild cards (_) or be solely a wild card character.

Data Types

In F#, pattern matching is used to extract values stored in a union type.

54
Unions in F#

 Example: matching the tuple (a, b) when a = 7 and b = "grape"

Pattern       Meaning                                        Matches?   Result
4, "apple"    Matches only if a = 4 and b = "apple"          No         Skipped
_, "grape"    Matches any a if b = "grape"                   Yes        grape
_             Default catch-all if no pattern above matches  -          Not reached

Data Types

let a = 7;;
•Declares a variable a with value 7.
let b = "grape";;
•Declares a variable b with value "grape".
let x = match (a, b) with
•Uses pattern matching on the tuple (a, b) which is (7, "grape").

55
Pointer and Reference Types

 A pointer type is one in which the variables have a range of values that consists of memory addresses and a special value, nil.

 The value nil is not a valid address and is used to indicate that a pointer
cannot currently be used to reference any memory cell.

 Pointers are designed for two distinct kinds of uses:


– Provide the power of indirect addressing
– Provide a way to manage dynamic memory. A pointer can be used to
access a location in the area where storage is dynamically created
(usually called a heap)

Data Types

Pointer Type:
•A pointer is like a signpost that tells you where to find something in memory (like an
address for a house).
•It can hold two types of values:
• A memory address (the location of some data in your computer's
memory).
• Nil, which means the pointer isn't pointing to anything right now (kind of
like an empty signpost or a broken one).

Nil:
•Nil is just a special value used to show that the pointer isn't currently pointing to any
valid memory location.
•Think of it like a "no address" or "empty" signpost. It tells you that the pointer isn’t
directing you to anything useful.

Why Pointers Are Useful:


•Indirect addressing: Pointers let you access data indirectly. Instead of working
directly with the data, you use the pointer to find it. It's like getting the address of a
house and going there instead of carrying the house itself.
•Dynamic memory management: Pointers are also used to manage memory that is
created while a program is running (called dynamic memory). This memory is usually
allocated in an area called the "heap". Pointers can point to this memory, allowing
the program to create and use memory on the fly.

56
Pointer Operations
 A pointer type usually includes two fundamental pointer operations,
assignment and dereferencing.
 Assignment sets a pointer var’s value to some useful address.
 Dereferencing takes a reference through one level of indirection.
– In C++, dereferencing is explicitly specified with the (*) as a prefix
unary operation.
– If ptr is a pointer var with the value 7080, and the cell whose address is 7080 has the value 206, then the assignment j = *ptr sets j to 206.

Data Types

Pointer Operations: Assignment and Dereferencing


•Assignment: This operation sets the pointer to a specific memory address. It’s like saying,
"Hey pointer, now point to this memory location."
•Dereferencing: This operation is used to access the value stored at the memory address that the pointer is pointing to. It’s like saying, "Okay, pointer, give me the actual value at that location, not the address itself."
2. Assignment
•When you assign a pointer to an address, you are telling the pointer where to "look" in
memory.
•Example: If you have a pointer called ptr, you might assign it the value 7080. This means that
ptr is now pointing to the memory address 7080.
3. Dereferencing
•Dereferencing is when you use the pointer to access the data at the memory address it is
pointing to.
•In C++, dereferencing is done using the * symbol before the pointer.
• For example, if you have ptr which is pointing to address 7080, then *ptr means
look at the value stored at address 7080.
Example:
•Let’s say we have a pointer ptr that holds the value 7080, meaning it points to the memory
address 7080.
•The memory address 7080 contains the value 206.
•If we dereference ptr (using *ptr), we would get the value 206 because that’s what is stored
at address 7080.
In C++:
•Dereferencing: You can dereference a pointer by using the * symbol.
• If ptr = 7080, then *ptr would give you the value 206 stored at memory address
7080.

57
DIAGRAM
j = *ptr;
This assigns the value pointed to by ptr to the variable j.

ptr is a pointer holding the address 7080.


•This address points to an anonymous dynamic variable in
memory.
At memory address 7080, the value stored is 206.
The *ptr operation means:
•Go to the memory address stored in ptr (which is 7080).
•Retrieve the value at that address (which is 206).
Assign that value to variable j.

FINAL RESULT: j is set to 206.
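Python has no pointer type of its own, but the standard ctypes module can imitate the two pointer operations. This is a sketch under that assumption; the real address is chosen by the system, so it will not be 7080, and the names cell and address are illustrative.

```python
import ctypes

cell = ctypes.c_int(206)          # a memory cell holding the value 206
ptr = ctypes.pointer(cell)        # assignment: ptr now holds the cell's address

j = ptr.contents.value            # dereferencing, like j = *ptr in C++

address = ctypes.addressof(cell)  # the actual address picked by the system

ptr.contents.value = 300          # writing through the pointer changes the cell
```

After the last line, reading cell.value gives 300, because ptr and cell refer to the same memory.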

57
Pointer Operations

 In C and C++, there are two ways a pointer to a record can be used to reference a
field in that record.
– If a pointer variable p points to a record with a field name age, (*p).age can be
used to refer to that field.
– The operator ->, when used between a pointer to a struct and a field of that struct, combines dereferencing and field reference.
– For example, the expression

p -> age is equivalent to (*p).age

Data Types

In C and C++, when you're working with a pointer to a record (like a struct), you can
reference the fields of the struct in two ways.

Using (*p).field
•When you have a pointer p that points to a struct (or record), you can access a field
using (*p).field.
•(*p) dereferences the pointer (so it accesses the struct the pointer is pointing to),
and then .field accesses a specific field within that struct.

If p is a pointer to a struct, and the struct has a field age, you can access it like this:
(*p).age;
This means:
•Dereference the pointer p to access the struct it points to.
•Access the age field in that struct.

Using p->field
•The -> operator is shorthand for dereferencing the pointer and accessing the field in
a single step.
•It combines the dereferencing (*p) and the field reference (.field) into one operation,
making it more concise and easier to read.
Like this: p->age;
This is equivalent to: (*p).age;
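The two-step access (dereference, then select the field) can be imitated with Python's ctypes module. This is only an analogy: ctypes has no -> operator, and the Person layout below is an assumption modelled on the earlier C struct.

```python
import ctypes

class Person(ctypes.Structure):
    # Field layout modelled on: struct Person { char name[50]; int age; };
    _fields_ = [("name", ctypes.c_char * 50), ("age", ctypes.c_int)]

person1 = Person(b"Alice", 30)
p = ctypes.pointer(person1)        # p points to the record

age_via_deref = p.contents.age     # dereference, then field: like (*p).age
p.contents.age = 31                # update the field through the pointer
```

Here p.contents plays the role of (*p), and the attribute access plays the role of .age.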

58
Pointer Operations
 Languages that provide pointers for the management of a heap must
include an explicit allocation operation.
– Allocation is sometimes specified with a subprogram, such as malloc
in C.
– In a language that supports object-oriented programming, allocation of heap objects is often specified with the new operation. C++, which does not provide implicit deallocation, uses delete as its deallocation operator.

Data Types

Explicit Allocation in Languages with Pointers


•In programming languages that use pointers (like C and C++), you need a way to
allocate memory on the heap (a special area of memory where data is stored
dynamically, during runtime).
•Heap memory is used for objects that don't have a fixed size and need to be created
and managed while the program is running.
•To allocate memory for such objects, languages with pointers typically provide an
explicit allocation operation.

Using malloc in C
•In C, memory is allocated on the heap using the function malloc. The name malloc stands for memory allocation; the function is used to request a specific amount of memory during the execution of a program.

Using new in Object-Oriented Languages (C++)


•In object-oriented languages like C++, the new operator is commonly used for
allocating heap memory. This is used when you're working with objects (like
instances of a class) rather than just basic types like integers or floats.

Deallocation: Using delete in C++


•When you're done with the allocated memory (i.e., you no longer need the object or
the data), you have to deallocate it to free up that space in memory.
•C++ requires explicit deallocation, and you do this with the delete operator for single
objects or delete[] for arrays of objects.

59
Pointer Problems
 Dangling Pointers (dangerous)
– A pointer points to a heap-dynamic variable that has been deallocated.
– Dangling pointers are dangerous for the following reasons:
1. The location being pointed to may have been allocated to some new heap-
dynamic variable.
- If the new variable is not the same type as the old one, type checks of uses of
the dangling pointer are invalid.
2. Even if the new one is the same type, its new value will bear no relationship to
the old pointer’s dereferenced value.
3. If the dangling pointer is used to change the heap-dynamic variable, the value of
the heap-dynamic variable will be destroyed.
4. It is possible that the location now is being temporarily used by the storage
management system, possibly as a pointer in a chain of available blocks of
storage, thereby allowing a change to the location to cause the storage manager
to fail.

Data Types

•A pointer is still trying to use memory that was already deleted.


•These pointers can cause problems for several reasons:
1.The deleted memory might now be used by something else.
2.If the new data is a different type, using the old pointer can cause type errors.
3.Even if the type is the same, the data is different and may not make sense anymore.
4.Changing the memory using the old pointer could mess up the new data stored
there.
5.The memory might now be used by the system itself, and changing it could crash
the program.

60
Pointers in C and C++

 Extremely flexible but must be used with care.


 Pointers can point at any variable regardless of when it was allocated
 Used for dynamic storage management and addressing
 Pointer arithmetic is possible in C and C++, which makes their pointers more interesting than those of other programming languages.
 C and C++ pointers can point at any variable, regardless of where it is allocated. In fact, they can point anywhere in memory, whether there is a variable there or not, which is one of the dangers of such pointers.
 Explicit dereferencing and address-of operators
 In C and C++, the asterisk (*) denotes the dereferencing operation, and the
ampersand (&) denotes the operator for producing the address of a variable.
For example, in the code

Data Types

1) Pointers are very flexible but must be used carefully.


2) They can point to any variable, no matter when it was created.
3) Pointers are useful for managing memory and finding data locations.
4) In C and C++, you can do math with pointers, which makes them more powerful
than in many other languages.
5) Pointers in C and C++ can point anywhere in memory — even to places where no
variable exists, which is risky and can cause errors.
6) You can get the value from a pointer (dereferencing) and also get the memory
address of a variable using special symbols.
7) In C and C++, * means get the value the pointer points to, and & means get the
address of a variable.

61
Type Checking
 Type checking is the activity of ensuring that the operands of an
operator are of compatible types.
 A compatible type is one that is either legal for the operator, or is
allowed under language rules to be implicitly converted, by compiler-
generated code, to a legal type.
 This automatic conversion is called a coercion.
– Ex: when an int variable and a float variable are added in Java, the value of the int variable is coerced to float and a floating-point addition is performed.
 A type error is the application of an operator to an operand of an
inappropriate type.
– Ex: in the original version of C, if an int value was passed to a function that expected a float value, a type error would occur, because compilers did not check the types of parameters

Data Types

1. Type checking makes sure the data types used in operations match or are
allowed.
2. A compatible type is one that works with the operation or can be safely changed
into the correct type by the compiler.
3. This automatic change is called coercion.
1. Example: In Java, if you add an int and a float, the int is automatically
changed to a float, and the result is a float.
4. A type error happens when the wrong type of data is used in an operation.
1. Example: In C, if a function expects a float but gets an int, that’s a type
error — especially in older C compilers that didn’t check for this.
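Coercion and a type error can both be seen in a few lines of Python. This is a small sketch; the variable names are illustrative, and Python performs these checks at run time rather than at compile time.

```python
count = 3                  # an int
ratio = 2.5                # a float

total = count + ratio      # the int is coerced to float; the result is a float

# A str and an int are not compatible operands for +, and no coercion
# is defined between them, so this is a type error.
try:
    bad = "3" + count
    error_detected = False
except TypeError:
    error_detected = True
```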

62
Type Checking
 If all type bindings are static, nearly all type checking can be static.
 If type bindings are dynamic, type checking must be dynamic and done
at run-time.
 Some languages, such as JavaScript and PHP, because of their type
binding, allow only dynamic type checking.
 It is better to detect errors at compile time than at run time, because the
earlier correction is usually less costly.
– The penalty for static checking is reduced programmer flexibility.
 Type checking is complicated when a language allows a memory cell to store values of different types at different times during execution.

Data Types

1. If all data types are fixed before running the program, most type checking can also
be done before running.
2. If data types are decided while the program is running, type checking must also
happen during the run.
3. Languages like JavaScript and PHP do type checking while the program runs
because their types are dynamic.
4. It’s better to find errors before running the program, because fixing them earlier
is easier and cheaper.
1. The downside of early (static) checking is that it gives programmers less
flexibility.
5. Type checking becomes harder when a variable can hold different types at
different times during program execution.
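The cost of purely dynamic type checking can be shown with a short Python sketch: a type error stays hidden until the faulty statement actually runs. The function label and its parameters are hypothetical.

```python
def label(value, as_text):
    if as_text:
        # Type error here whenever value is not a str,
        # but only if this branch is actually executed.
        return "value: " + value
    return value

ok = label(5, as_text=False)    # faulty branch never runs, so no error appears

try:
    label(5, as_text=True)      # now the faulty branch runs
    caught = False
except TypeError:
    caught = True
```

A static type checker could report the problem before the program runs; with dynamic checking it shows up only on the execution path that reaches it.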

63
Strong Typing
 A programming language is strongly typed if type errors are always
detected. This requires that the types of all operands can be
determined, either at compile time or run time.
 Advantage of strong typing: allows the detection of the misuses of
variables that result in type errors.
 C and C++ are not strongly typed languages because both include union types, which are not type checked.
 Java and C# are strongly typed. Types can be explicitly cast, which could result in a type error. However, there are no implicit ways type errors can go undetected.
 The coercion rules of a language have an important effect on the value
of type checking.

Data Types

1. A language is strongly typed if it always catches type errors.


2. To do this, the program must know the types of all values, either before or during
the program's run.
3. Benefit: It helps catch mistakes when variables are used incorrectly.
4. C and C++ are not strongly typed because they allow a feature called unions,
which skips type checks.
5. Java and C# are strongly typed. You can force a type change (casting), which might
cause an error, but the system won’t miss type errors on its own.
6. The way a language handles automatic type changes (coercion) affects how well it
checks for type errors.

64
Strong Typing
 Coercion results in a loss of part of the reason for strong typing: error detection.
– Ex:

 So, the value of strong typing is weakened by coercion


– Languages with a great deal of coercion, like C and C++ are less
reliable than those with no coercion, such as ML and F#.
– Java and C# have half as many assignment type coercions as C++, so
their error detection is better than that of C++, but still not nearly as
effective as that of ML and F#.
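How coercion can hide a mistake is sketched below in Python. The scenario is assumed for illustration: the programmer meant to add a and b but typed d, and the two hypothetical functions show a permissive version and a strict one.

```python
def add_counts(a, b):
    """Permissive: relies on coercion, so int + float is accepted."""
    return a + b

def add_counts_strict(a, b):
    """Strict: rejects any operand that is not exactly an int."""
    if type(a) is not int or type(b) is not int:
        raise TypeError("both operands must be int")
    return a + b

a, b, d = 3, 4, 2.5

silent = add_counts(a, d)       # typo for (a, b) goes unnoticed

try:
    add_counts_strict(a, d)     # the same typo is reported
    detected = False
except TypeError:
    detected = True
```

With coercion the typo simply yields a float result; without it the mistake is reported as a type error.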

Data Types

65
Type Equivalence
 Two types are equivalent if an operand of one type in an expression can be substituted for one of the other type, without coercion.
 There are two approaches to defining type equivalence: name type
equivalence and structure type equivalence.
 Name type equivalence means the two variables have equivalent types if
they are in either the same declaration or in declarations that use the
same type name
– Easy to implement but highly restrictive:
– Subranges of integer types are not equivalent with integer types
– Formal parameters must be the same type as their corresponding
actual parameters
– Ex, Ada

Data Types

1. Two types are the same if you can use one in place of the other without needing to
convert it.
2. There are two ways to decide if types are the same: by name or by structure.
3. Name type equivalence means two variables are the same type only if they were
declared using the same type name or in the same declaration.
1. This method is easy to use in programming but has many limits.
2. If you create a smaller range from an integer (like 1 to 10), it won’t be treated
the same as a regular integer.
3. The type of variables used in function definitions must exactly match the type of
variables passed when calling the function.
1. For example, this strict rule is used in the Ada programming language.
- This code snippet is showing an example of name type equivalence using
the Ada programming language (or similar syntax).

•Indextype is a new type that only allows numbers from 1 to 100.


•count is a regular integer.
•index uses the new type Indextype.
Even though both count and index hold numbers, they are not considered the same type
because:
•One is an Integer.
•The other is a custom type (Indextype), even if it’s based on integers.
Since name type equivalence is being used, types are only considered the same if they have
the exact same name — and these two don’t.
So, you can't just assign count to index or vice versa without a conversion.
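The count and index situation can be imitated in Python. This is only an analogy, not Ada: the class IndexType and the function same_type_by_name are hypothetical stand-ins for a 1..100 subrange type and a name-equivalence check.

```python
class IndexType(int):
    """Hypothetical stand-in for an Ada-style subrange 1..100."""
    def __new__(cls, value):
        if not 1 <= value <= 100:
            raise ValueError("IndexType must be in 1..100")
        return super().__new__(cls, value)

def same_type_by_name(x, y):
    """Name equivalence: the types must be the very same named type."""
    return type(x) is type(y)

count = 42                     # a regular integer
index = IndexType(7)           # uses the new named type

equivalent = same_type_by_name(count, index)   # different type names
converted = IndexType(count)                   # explicit conversion is needed
```

Both variables hold whole numbers, yet the check fails until count is explicitly converted to IndexType.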

66
Type Equivalence
 Structure type equivalence means that two variables have equivalent
types if their types have identical structures
– More flexible, but harder to implement

Data Types

Under this rule, two variables have equivalent types if their types have the same
structure (or shape), even if the type names or index ranges are different.
It is more flexible than name type equivalence, but harder to implement in a
programming language.

In the example for this slide, Vector_1 and Vector_2 are both arrays of integers.

They use the same structure (both are arrays of integers), even though:
•They have different names.
•They use different index ranges (1..10 vs 11..20).

The two variables are considered equivalent because their structure is the same —
they’re both arrays of integers.
The range of indexes doesn’t matter here because the array type is unconstrained,
which means its size can vary.
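
A small C sketch can show the structural view, assuming a C11 compiler. The names Meters, Feet, IS_INT, and add_lengths are made up for this illustration. In C, typedef only gives a new name to an existing type, so two differently named aliases with the same structure are the same type and can be mixed without conversion.

```c
/* Two differently named aliases with the same underlying structure.
   A typedef introduces only a new name, so both are the type int. */
typedef int Meters;
typedef int Feet;

/* C11 generic selection picks a branch by type compatibility:
   evaluates to 1 when x has a type compatible with int, else 0. */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

/* Accepts both aliases and mixes them freely; no conversion is needed
   because the compiler sees two int parameters. */
int add_lengths(Meters m, Feet f) {
    return m + f;
}
```

Here a Meters value can be assigned to a Feet variable directly. This is convenient, but it also means the compiler will not catch a mix-up between the two units, which is the trade-off between structure and name type equivalence.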

Type Equivalence
• C uses both name and structure type equivalence.
– Name type equivalence is used for structure, enumeration, and
union types.
– Other nonscalar types use structure type equivalence. Array types are
equivalent if they have the same component types. Also, if an array type
has a constant size, it is equivalent either to other arrays with the
same constant size or to those without a constant size.
• In languages that do not allow users to define and name types, such as
Fortran and COBOL, name equivalence obviously cannot be used.


C uses both name type and structure type equivalence.
→ C checks type equivalence in two ways: sometimes by name, sometimes by structure.

Name type equivalence is used for structure, enumeration, and union types.
→ For struct, enum, and union, two variables are the same type only if they were declared
using the same name.
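
As a rough illustration (assuming C11 and a single source file), the sketch below declares two structs with identical members but different names. PointA, PointB, Point, IS_POINT_A, and point_a_from_b are hypothetical names used only for this example.

```c
/* Same members, different tags: within one source file these are
   two distinct, incompatible types. */
struct PointA { int x; int y; };
struct PointB { int x; int y; };

/* A typedef of struct PointA is only another name for the same type. */
typedef struct PointA Point;

/* Evaluates to 1 when v has a type compatible with struct PointA, else 0. */
#define IS_POINT_A(v) _Generic((v), struct PointA: 1, default: 0)

/* Converting between the two structs has to be written out
   member by member; direct assignment is rejected by the compiler. */
struct PointA point_a_from_b(struct PointB b) {
    struct PointA a = { b.x, b.y };
    return a;
}
```

A variable of type struct PointB cannot be assigned to a struct PointA variable, even though their layouts match, while a Point variable can, because Point names the same type.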

Other non-scalar types use structure type equivalence.


→ For other complex types (like arrays), C checks if their structure is the same to decide if
they are equivalent.

Array types are equivalent if they have the same component types.
→ Arrays are considered the same if the type of their elements is the same (e.g., both are
arrays of integers).

Also, if an array type has a constant size, it is equivalent either to other arrays with the same
constant size or to those without a constant size.
→ If an array has a fixed size (like 5 elements), it's seen as the same type as other arrays with:
•the same size, or
•no fixed size.
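
The array rule can be checked with pointers to arrays, as in this sketch (assuming a C11 compiler). The macro and function names are invented for the example. A pointer to an array of 5 int is compatible with a pointer to an array of int with no stated size, but not with a pointer to an array of 10 int.

```c
/* 1 when p is compatible with "pointer to array of 5 int", else 0. */
#define IS_PTR_TO_INT5(p) _Generic((p), int (*)[5]: 1, default: 0)

/* 1 when p is compatible with "pointer to array of int, size not given". */
#define IS_PTR_TO_INT_ARRAY(p) _Generic((p), int (*)[]: 1, default: 0)

/* The parameter type gives no size, so pointers to int arrays of any
   constant size can be passed without a cast. */
int first_element(int (*arr)[]) {
    return (*arr)[0];
}
```

For `int a[5]` and `int b[10]`, both `&a` and `&b` can be passed to first_element, but `&b` cannot be assigned to a variable of type `int (*)[5]` without a diagnostic from the compiler.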

In languages that do not allow users to define and name types, such as Fortran and COBOL,
name equivalence obviously cannot be used.
→ In some older languages like Fortran and COBOL, since you can’t create and name new
types, name-based checking isn’t possible. Only structure-based comparison is used.

Theory and Data Types
• Type theory is a broad area of study in mathematics, logic, computer
science, and philosophy
• Two branches of type theory in computer science:
– Practical – The practical branch is concerned with data types in
commercial programming languages
– Abstract – The abstract branch primarily focuses on typed lambda
calculus, an area of extensive research by theoretical computer
scientists over the past half century
• A data type defines a set of values and a collection of operations on
those values
• A type system is a set of types and the rules that govern their use in
programs


Type theory is a big field studied in math, logic, computer science, and philosophy.
→ Many areas of study explore how types work and how we use them.

There are two branches of type theory in computer science:
•Practical: This deals with the data types used in real-world programming languages.
→ It focuses on things like integers, strings, and arrays, which programmers use daily.
•Abstract: This is more theoretical and focuses on typed lambda calculus.
→ It’s a deep area of research about how functions and types work at a fundamental
level.

A data type is a group of possible values and the actions you can do with them.
→ For example, an integer type includes numbers and allows math operations.

A type system is a collection of types and rules for how they are used in a program.
→ It makes sure you use data the right way, like not mixing strings with numbers in
math.

Thank you.
