PL C6
DATA TYPES
Introduction
Data Types
A data type defines a collection of data values and a set of predefined operations
on those values.
→ A data type is a way to group similar kinds of data (like numbers or letters) and
decide what actions can be done with them (like adding numbers or combining text).
Computer programs produce results by manipulating data.
→ Programs work by taking data, changing or processing it in some way, and then
producing a result.
ALGOL 68 provided a few basic types and a few flexible structure-defining operators
that allow a programmer to design a data structure for each need.
→ The ALGOL 68 programming language gave programmers some basic data types
and special tools to create custom data structures for different purposes.
A descriptor is the collection of the attributes of a variable.
→ A descriptor is a set of details about a variable, like its type, size, and where it is
stored in memory.
In an implementation, a descriptor is a collection of memory cells that store
variable attributes.
→ When a program runs, a descriptor is stored in memory, holding information about
a variable so the computer knows how to use it.
If the attributes are static, descriptors are required only at compile time.
→ If a variable's details (like type and size) do not change, the descriptor is only
needed when the program is being prepared (before it runs).
These descriptors are built by the compiler, usually as a part of the symbol table,
and are used during compilation.
→ The compiler creates these descriptors and stores them in a table (called a symbol
table) to help check and organize variables while the program is being compiled.
For dynamic attributes, part or all of the descriptor must be maintained during
execution.
→ If a variable's details can change while the program runs, the descriptor must be
kept in memory so the program can update and use it as needed.
Descriptors are used for type checking and by allocation and deallocation
operations.
→ Descriptors help make sure variables are used correctly (type checking) and
manage memory when variables are created or removed.
Primitive Data Types
Those not defined in terms of other data types are called primitive data types.
→ Primitive data types are basic types that are not built from other types, like
numbers and letters.
Almost all programming languages provide a set of primitive data types.
→ Most programming languages include basic data types like numbers, text, and
true/false values.
Some primitive data types are merely reflections of the hardware – for example,
most integer types.
→ Some basic data types, like whole numbers (integers), exist because computer
hardware is designed to handle them directly.
The primitive data types of a language are used, along with one or more type
constructors, to provide structured types.
→ Basic data types can be combined or modified using special tools (type
constructors) to create more complex data structures.
Numeric Types
Integer
Floating-point
Complex
Decimal
Integer
Floating-point
→ A floating-point number is a number that can have a fractional part, like 3.14 or -2.5.
Model real numbers, but only as approximations for most real values.
→ Floating-point numbers represent real numbers, but they are not always exact due to how
computers store them.
On most computers, floating-point numbers are stored in binary, which exacerbates the
problem.
→ Computers store floating-point numbers in binary (1s and 0s), so many decimal fractions
(such as 0.1) cannot be stored exactly, which causes small errors in calculations.
Another problem is the loss of accuracy through arithmetic operations.
→ When doing math with floating-point numbers, small errors can build up, making the
result slightly inaccurate.
Languages for scientific use support at least two floating-point types; sometimes more
(e.g., float and double).
→ Programming languages used for science and engineering usually offer at least two types
of floating-point numbers, like float (less precise) and double (more precise).
The collection of values that can be represented by a floating-point type is defined in terms
of precision and range.
→ Floating-point numbers have limits on how accurately they store values
(precision) and how big or small the numbers can be (range).
Precision is the accuracy of the fractional part of a value, measured as the number of bits.
Figure below shows single and double precision.
→ Precision refers to how many significant digits a floating-point number can store, depending
on how many bits are used (e.g., single-precision uses fewer bits than double-precision).
Range is the range of fractions and exponents.
→ Range refers to how large or small the floating-point number can be; it is determined by both the
fractional part and, above all, the exponent (as in scientific notation).
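The approximation and accumulation problems described above can be observed directly. The following is a minimal sketch in Python, assuming that Python's float is an IEEE 754 double (true on typical platforms, but an assumption here); the variable names are illustrative only.

```python
import sys

# 0.1 and 0.2 have no exact binary representation, so their sum is only
# an approximation of 0.3.
total = 0.1 + 0.2
is_exact = (total == 0.3)          # False

# Loss of accuracy through arithmetic: adding 0.1 ten times does not
# give exactly 1.0.
accumulated = 0.0
for _ in range(10):
    accumulated += 0.1
drift = abs(accumulated - 1.0)

# Precision (bits in the fraction, including the implicit leading bit)
# and range (largest finite value) of this floating-point type.
mantissa_bits = sys.float_info.mant_dig
largest = sys.float_info.max

print(total, is_exact, drift, mantissa_bits, largest)
```

The drift is tiny, but it is not zero, which is why floating-point values are usually compared with a tolerance rather than with ==.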
[Figure: IEEE 754 floating-point formats used to represent real numbers, showing the
single-precision and double-precision layouts.]
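To make the two IEEE 754 formats concrete, the sketch below splits a value into its sign, exponent, and fraction fields. It assumes the standard field widths (1, 8, and 23 bits for single precision; 1, 11, and 52 bits for double precision); the helper name float_fields is hypothetical.

```python
import struct

def float_fields(value, double=False):
    """Return the (sign, exponent, fraction) bit fields of an IEEE 754 value."""
    if double:
        # Reinterpret the 8 bytes of a double as a 64-bit unsigned integer.
        bits = struct.unpack(">Q", struct.pack(">d", value))[0]
        exp_bits, frac_bits = 11, 52
    else:
        # Reinterpret the 4 bytes of a single as a 32-bit unsigned integer.
        bits = struct.unpack(">I", struct.pack(">f", value))[0]
        exp_bits, frac_bits = 8, 23
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

print(float_fields(1.0))               # single precision
print(float_fields(1.0, double=True))  # double precision
print(float_fields(-2.5))
```

The exponent field is stored with a bias (127 for single, 1023 for double), which is why 1.0 shows a nonzero exponent field.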
Complex
(7 + 3j)
Complex numbers are useful in engineering, physics, and computer science, especially for signal processing
and electrical circuits.
Python and some other programming languages (like Fortran) have built-in support for
complex numbers, allowing mathematical operations like addition, subtraction,
multiplication, and division.
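As a small illustration of the built-in support mentioned above, the following Python sketch uses the literal (7 + 3j) from the slide; the second operand w is an arbitrary value chosen for the example.

```python
z = 7 + 3j          # complex literal: real part 7, imaginary part 3
w = 1 - 2j          # arbitrary second operand for illustration

total = z + w       # addition
difference = z - w  # subtraction
product = z * w     # multiplication
quotient = z / w    # division

magnitude = abs(3 + 4j)   # distance from the origin in the complex plane

print(total, difference, product, quotient, magnitude)
print(z.real, z.imag)
```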
Decimal
Most larger computers that are designed to support business applications have
hardware support for decimal data types.
→ Big computers used in business applications often have special hardware that
helps them handle decimal numbers more easily and accurately.
Decimal types store a fixed number of decimal digits, with the decimal point at a
fixed position in the value.
→ Decimal numbers are stored with a specific number of digits, and the decimal
point (like 3.14) is always in the same place.
These are the primary data types for business data processing and are therefore
essential to COBOL.
→ Decimal numbers are the main types of data used in business calculations, so they
are very important for COBOL, a programming language used for business
applications.
Advantage: accuracy of decimal values.
→ One benefit of using decimal types is that they provide exact accuracy for decimal
numbers, avoiding rounding errors.
Disadvantages: limited range since no exponents are allowed, and its
representation wastes memory.
→ The downside of decimal types is that they can only represent a limited range of
numbers because they don’t use scientific notation (exponents), and they can use up
more memory than necessary.
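The accuracy advantage can be seen by comparing binary floating point with a decimal representation. The sketch below uses Python's decimal module as a stand-in; note that this is a software decimal type, not the fixed-point hardware decimal described above, so it only illustrates the accuracy point.

```python
from decimal import Decimal

# Binary floating point: the sum of 0.10 and 0.20 is not exactly 0.30.
binary_sum = 0.10 + 0.20
binary_exact = (binary_sum == 0.30)

# Decimal arithmetic: the same sum is exact, as expected for money.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_exact = (decimal_sum == Decimal("0.30"))

print(binary_sum, binary_exact)
print(decimal_sum, decimal_exact)
```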
Boolean Types
1.Introduced by ALGOL 60
→ Boolean types were first introduced in the programming language ALGOL 60.
2.They are used to represent switches and flags in programs
→ Boolean values are commonly used to indicate whether something is on/off or to
represent conditions like "yes/no" in programs.
3.The use of Booleans enhances readability
→ Using Booleans makes code easier to understand because they clearly express true
or false conditions.
4.Range of values: two elements, one for “true” and one for “false”
→ A Boolean type can only have two values: true (usually represented as 1) or false
(usually represented as 0).
5.One popular exception is C89, in which numeric expressions are used as
conditionals. In such expressions, all operands with nonzero values are considered
true, and zero is considered false
→ In C89, instead of using true or false directly, numbers are used for conditions. Any
number other than zero is treated as true, and zero is treated as false.
6.A Boolean value could be represented by a single bit, but often stored in the
smallest efficiently addressable cell of memory, typically a byte
→ While a Boolean could technically be stored as just one bit (0 or 1), it is often
stored in a byte (typically 8 bits) because that is the smallest unit of memory that a
computer can address efficiently.
Character Types
Character String Types
Design Issues
String and Their Operations
Typical operations:
→ Common actions that can be performed on strings include:
– Assignment
→ Assigning a value (like a word or sentence) to a string variable.
– Comparison (=, >, etc.)
→ Comparing strings to check if they are equal, greater than, or less than each other.
– Catenation
→ Combining (or concatenating) two strings to form a longer string (e.g., "Hello" +
"World" = "HelloWorld").
– Substring reference
→ Accessing a part (substring) of a string, like getting the first few characters of
"Hello" (e.g., "Hel").
– Pattern matching
→ Checking if a string matches a specific pattern, like searching for the word "cat" in
a sentence.
C and C++ use char arrays to store char strings and provide a collection of string
operations through a standard library whose header is string.h
→ In C and C++, strings are stored as arrays of characters, and the standard library (through the
header string.h) provides many functions to manipulate strings.
Character strings are terminated with a special character, null, which is represented
with zero
→ In C and C++, strings are marked by a special character called "null" ('\0'), which
signals the end of the string.
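The typical string operations listed above can be sketched in a few lines. The example below uses Python rather than C, so the null-terminated layout is only imitated with a bytes object; the variable names are illustrative.

```python
import re

greeting = "Hello"                         # assignment
is_equal = (greeting == "Hello")           # comparison for equality
comes_first = ("apple" < "banana")         # comparison by character codes
joined = greeting + "World"                # catenation
prefix = greeting[0:3]                     # substring reference
found = re.search(r"\bcat\b", "the cat sat") is not None   # pattern matching

# Imitation of C storage: the characters followed by a terminating zero.
c_style = greeting.encode("ascii") + b"\x00"
length = c_style.index(0)                  # count characters up to the null

print(joined, prefix, found, length)
```

Counting up to the zero byte is what makes an explicit length field unnecessary in C, at the cost of scanning the string each time its length is needed.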
Some of the most commonly used library functions for character strings in C and C++ are
→ Here are some of the most frequently used functions in C and C++ for working with
character strings:
– strcpy: copy strings
→ The strcpy function copies the contents of one string into another.
– strcat: catenates one given string onto another
→ The strcat function concatenates (or joins) one string to the end of another string.
– strcmp: lexicographically compares (the order of their codes) two strings
→ The strcmp function compares two strings character by character, using the numeric codes of
their characters (for example, ASCII), and reports whether the first string is less than, equal
to, or greater than the second.
– strlen: returns the number of characters, not counting the null
→ The strlen function returns the length of a string by counting the number of characters,
but it doesn’t count the null character ('\0').
In Java, strings are supported by the String class, whose values are constant strings, and
the StringBuffer class whose values are changeable and are more like arrays of single
characters
→ In Java, strings are handled by the String class, where strings cannot be changed once
created, and by the StringBuffer class, which allows modification of the string, like an array of
characters.
C# and Ruby include string classes that are similar to those of Java
→ Both C# and Ruby have string classes similar to Java's, so strings are handled in a
similar way in those languages.
Python strings are immutable, similar to the String class objects of Java
→ In Python, strings cannot be changed after they are created, just like Java's String class
objects, which are also immutable.
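Immutability is easy to demonstrate. In the sketch below, an attempt to change one character of a Python string fails, while a list of characters (used here as a rough analogue of a changeable buffer, not as Java's StringBuffer) can be modified in place.

```python
text = "hello"

# Strings are immutable: assigning to one character raises TypeError.
try:
    text[0] = "J"
    mutated = True
except TypeError:
    mutated = False

# "Changing" a string really builds a new object; old references are unaffected.
alias = text
text = text + " world"

# A list of single characters can be changed in place and joined back.
buffer = list("hello")
buffer[0] = "J"
result = "".join(buffer)

print(mutated, alias, text, result)
```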
String Length Options
Static Length String: The length can be static and set when the string is
created. This is the choice for the immutable objects of Java’s String class
as well as similar classes in the C++ standard class library and the .NET class
library available to C# and F#
String length options refer to how the size of a string is determined and whether it
can change during the program's execution. The way a string's length is handled
affects how memory is allocated and how flexible the string is in different
programming languages. Here are three common approaches:
Evaluation
1.Aid to writability
→ String types help make programming easier (writability) by providing a way to
handle sequences of characters, which are essential in many programs.
2.As a primitive type with static length, they are inexpensive to provide--why not
have them?
→ If strings have a fixed length (static), they are simple and cheap to implement, so
there’s little reason not to use them in programs.
3.Dynamic length is nice, but is it worth the expense?
→ While strings with dynamic length (which can change in size) offer more flexibility,
they come at the cost of additional memory management and processing, so it's
important to weigh whether the benefits justify the extra expense.
Implementation of Character String Types
The compile-time descriptor for static strings shown in Figure 6.2 consists of three
fields:
1.Static string (Type Name) – Represents the name of the string type, indicating that
it is a static string.
2.Length – Stores the number of characters in the string (i.e., the size of the string).
3.Address – Contains the memory location of the first character of the string.
Limited Dynamic Length Strings - may need a run-time descriptor for length
to store both the fixed maximum length and the current length (but not in
C and C++ because the end of a string is marked with the null character)
These strings can change in length at runtime but have a fixed maximum length.
A run-time descriptor is needed to store:
1.Maximum length – The upper limit of how long the string can be.
2.Current length – The actual length of the string at any given time.
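A run-time descriptor of this kind can be sketched as a small class. The class name LimitedString and its methods are hypothetical and serve only to illustrate the two descriptor fields; storage is allocated once, at the fixed maximum length.

```python
class LimitedString:
    """Hypothetical limited-dynamic-length string with a run-time descriptor."""

    def __init__(self, max_length, initial=""):
        self.max_length = max_length        # descriptor field: fixed maximum
        self.current_length = 0             # descriptor field: current length
        self._cells = [""] * max_length     # storage allocated once
        self.assign(initial)

    def assign(self, text):
        # The current length may vary, but never beyond the fixed maximum.
        if len(text) > self.max_length:
            raise ValueError("string exceeds its fixed maximum length")
        for i, ch in enumerate(text):
            self._cells[i] = ch
        self.current_length = len(text)

    def value(self):
        return "".join(self._cells[:self.current_length])


name = LimitedString(8, "Bob")
name.assign("Roberta")
print(name.value(), name.current_length, name.max_length)
```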
1. Strings can be stored in a linked list, so that when a string grows, the newly
required cells can come from anywhere in the heap
• The drawbacks to this method are the extra storage occupied by the
links in the list representation and the necessary complexity of string
operations
• String operations are slowed by the required pointer chasing
Enumeration Types
The enumeration constants are typically implicitly assigned the integer values
0, 1, …, but can be explicitly assigned any integer literal in the type’s definition
Enumeration types are a way to define a set of named constants that represent
specific values. Here’s an explanation of the concept:
•All possible values, which are named constants, are provided, or enumerated, in
the definition
→ In an enumeration type, you define a list of possible values that a variable of this
type can have. Each of these values is given a name (constant), making the code more
readable.
•Enumeration types provide a way of defining and grouping collections of named
constants, which are called enumeration constants
→ Enumerations allow you to group related named constants (called enumeration
constants) together under a single type. This helps organize and simplify code.
Designs
In some programming languages, where enumeration types are not available,
programmers use integer values to simulate them. Here's how it's done:
•In languages that do not have enumeration types, programmers usually simulate
them with integer values
→ If a programming language doesn’t support enumeration types, developers often
use regular integer variables to represent the different values of an enumeration,
giving each value a unique integer.
•For example, early (pre-standard) C did not have an enumeration type. We might use 0 to
represent blue, 1 to represent red, and so forth. These values could be defined as follows:
→ Before C had a built-in enumeration type, developers could
use integer values to represent different colors. Here, 0 would represent blue and 1
would represent red.
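The difference between simulating an enumeration with integers and using a real enumeration type can be sketched as follows. The example is in Python, not C; the names BLUE, RED, and Color are illustrative.

```python
from enum import Enum

# Simulation with plain integers: nothing stops meaningless arithmetic.
BLUE = 0
RED = 1
nonsense = BLUE + RED * 7     # accepted, although it means nothing as a color

# A real enumeration type: the constants are grouped under one type.
class Color(Enum):
    BLUE = 0
    RED = 1

# Arithmetic on enumeration constants is rejected.
try:
    Color.BLUE + Color.RED
    arithmetic_allowed = True
except TypeError:
    arithmetic_allowed = False

print(nonsense, arithmetic_allowed, Color.RED.name, Color.RED.value)
```

With plain integers, any integer value can also be assigned to a "color" variable, so the simulation gives readability but no type checking.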
The colors type uses the default internal values for the enumeration constants,
0, 1, …, although the constants could have been assigned any integer literal.
Array Types
•An array is a homogeneous aggregate of data elements in which an individual
element is identified by its position in the aggregate, relative to the first element.
→ An array is a collection of elements that are all of the same type (homogeneous).
Each element in the array can be accessed by its position, or index, starting from the
first element.
•The individual data elements of an array are of the same type.
→ All elements in an array must be of the same data type, like integers, floats, or
strings. For example, an array of integers can only hold integer values.
•References to individual array elements are specified using subscript
expressions.
→ To access an element in the array, you use a subscript expression, which is
typically written using square brackets [] containing the index number. For example,
arr[0] refers to the first element in the array arr.
•If any of the subscript expressions in a reference include variables, then the
reference will require an additional run-time calculation to determine the address
of the memory location being referenced.
→ When the index used to access an array element is stored in a variable, the
program must compute the memory address of the element at that index during
runtime. This makes the access slightly slower, as it requires additional calculations to
find the correct memory location.
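The run-time calculation mentioned above is, for a one-dimensional array, the base address plus the offset of the element. The sketch below shows that arithmetic; the function name element_address and the sample addresses are hypothetical.

```python
def element_address(base_address, index, element_size, lower_bound=0):
    """Address of element `index` in a one-dimensional array.

    base_address is the address of the element at lower_bound.
    """
    return base_address + (index - lower_bound) * element_size


# An array of 4-byte integers starting at (hypothetical) address 1000.
addresses = [element_address(1000, i, 4) for i in range(4)]
print(addresses)

# The same calculation when subscripts start at 1 instead of 0.
print(element_address(1000, 3, 4, lower_bound=1))
```

When the subscript is a constant, a compiler can do this arithmetic once at compile time; with a variable subscript it must be done each time the reference is executed.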
Design Issues
The primary design issues specific to arrays are the following:
•What types are legal for subscripts?
→ This question addresses what types of values can be used to
access elements in an array. For example, can you use an integer,
floating point, or string as the index, or must it be a specific type like
integers?
•Are subscripting expressions in element references range
checked?
→ This refers to whether the program checks if the index used to
access an array element is within valid bounds (e.g., checking that the
index is not less than 0 and not beyond the last valid index). If range
checking is enabled, the program will report an error if an invalid
index is used.
•When are subscript ranges bound?
→ This addresses when the size or range of indices that can be used
for accessing array elements is determined. It could be done at
compile-time (before the program runs) or at runtime (while the
program is running), affecting how flexible or efficient the array is.
•When does allocation take place?
→ This refers to when the memory for the array is actually allocated.
It could be allocated statically, before execution begins, or dynamically
during execution. This affects how arrays are managed
and how memory is used.
•Are ragged or rectangular multidimensional arrays
allowed, or both?
→ This addresses whether multidimensional arrays (like
matrices) can have rows with varying lengths (ragged
arrays) or must all have the same length in each
dimension (rectangular arrays). Some languages support
only rectangular arrays, while others allow ragged arrays
as well.
•Can arrays be initialized when they have their storage
allocated?
→ This question asks if you can assign initial values to an
array when it is created, or if the array will be empty
until later assignments. Some languages allow
initialization at the time of allocation, while others may
not.
•What kinds of slices are allowed, if any?
→ A "slice" refers to a portion of an array (such as a
subarray or a subset of elements). This question
concerns whether languages support extracting and
working with portions of arrays, and how flexible or
complex these slices can be (e.g., specifying ranges of
indices, skipping elements, etc.).
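Several of the design issues above can be seen in how one particular language answers them. The sketch below uses Python lists as an example; it shows one language's choices, not the only possible ones.

```python
values = [10, 20, 30, 40, 50, 60]

# Range checking: a subscript past the end is rejected at run time.
try:
    values[6]
    out_of_range_detected = False
except IndexError:
    out_of_range_detected = True

# Slices: a range of indices, and a range that skips elements.
middle = values[1:4]
every_other = values[::2]

# Ragged multidimensional structure: rows of different lengths.
ragged = [[1], [2, 3], [4, 5, 6]]
row_lengths = [len(row) for row in ragged]

# Initialization at the time storage is allocated.
zeros = [0] * 4

print(out_of_range_detected, middle, every_other, row_lengths, zeros)
```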
Arrays and Indices
Sum := Sum + B(I);
→ This is an example of indexing in Ada. The value of the
array element B(I) at index I is added to Sum, and the
result is stored back in Sum.
•Because ( ) are used for both subprogram parameters
and array subscripts in Ada, this results in reduced
readability:
→ In Ada, parentheses are used for both
function/subprogram parameters and array subscripts,
which can sometimes confuse readers, as it is unclear
whether the parentheses are referring to an array index
or a function call. This can affect the readability of code,
especially when it is not immediately obvious whether
something is an array access or a function call.
Array Initialization
When a char array is initialized with a 7-character string constant such as "Quellos", the
array will have 8 elements, because all strings are terminated with a null
character (zero), which is implicitly supplied by the system for string constants.
•Example in C:
int list[] = {4, 5, 7, 83};
•Here, the array list is initialized with values 4, 5, 7, and 83.
•The compiler automatically sets the array length based on the number of values provided.
In this case, list will have 4 elements.
Character Strings in C & C++:
•In C and C++, strings are stored as arrays of char.
•For example:
char name[] = "Quellos";
In summary, array initialization allows you to set values when you create the array, and in
some cases, like strings, the system adds special characters like the null character to mark the
end.
Arrays of strings in C and C++ can also be initialized with string literals. For
example:
char *names[] = {"Quellos", "Carlos", "Jr"};
Here, names is an array of pointers to strings (character arrays).
Each element in the array points to a string literal (like
"Quellos", "Carlos", and "Jr").
This means names[0] points to "Quellos", names[1] points to
"Carlos", and names[2] points to "Jr".
Java:
•In Java, you can use a similar syntax to initialize an array of
String objects.
•Example:
String[] names = {"Quellos", "Carlos", "Jr"};
In this case, names is an array of references to String objects.
Each element in the array holds a reference to a String object,
and the array is initialized with string literals ("Quellos",
"Carlos", and "Jr").
Array Operations
dynamic, meaning their size can change
(elements can be added or removed), similar to
dynamic arrays in other languages.
5. "Because the objects can be of any types, these
arrays are heterogeneous."
1. Python's lists can contain elements of different
types (e.g., integers, strings, or other objects),
which makes them heterogeneous (as opposed
to arrays in other languages, which are typically
homogeneous, meaning all elements are of the
same type).
6."Python provides array assignments, but they are only
reference changes."
1. When you assign one list to another in Python,
you're not copying the elements of the list.
Instead, both variables point to the same list in
memory. This is called reference assignment,
meaning the list is not duplicated, but rather the
reference to the list is assigned to the new
variable.
7."Python also supports array catenation and element
membership operations."
1. In Python, you can concatenate (combine) lists
using the + operator, and you can also check if
an element is in a list using the in keyword
(membership test).
8."Ruby also provides array catenation."
1. Ruby also supports array concatenation,
allowing you to join arrays together using
methods or operators (like +).
9. "APL provides the most powerful array processing
operations for vectors and matrices as well as unary
operators (for example, to reverse column elements)."
1. APL (A Programming Language) is a language
known for its powerful array manipulation
capabilities. It supports operations for handling
vectors (one-dimensional arrays) and matrices
(two-dimensional arrays), and it includes special
operators (like unary operators) that allow for
advanced operations, such as reversing columns
in a matrix.
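The Python list behaviours described in the items above (heterogeneous elements, reference assignment, catenation, and membership) can be checked with a short sketch; the variable names are illustrative.

```python
mixed = [1, "two", 3.0]        # heterogeneous elements

# Assignment copies the reference, not the elements.
alias = mixed
alias.append(4)                # the change is visible through both names

# An explicit copy is a separate list.
copy = list(mixed)
copy.append(5)

combined = [1, 2] + [3, 4]     # catenation
has_two = "two" in mixed       # element membership

print(mixed, alias is mixed, copy, combined, has_two)
```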
Associative Arrays
%salaries = ("Gary" => 75000, "Perry" => 57000, "Mary" => 55750, "Cedric" => 47850);
"Associative arrays are supported by the standard
class libraries of Java, C++, C#, and F#."
•Languages such as Java, C++, C#, and F# do not build
associative arrays into the language itself; they provide
them through standard library classes (for example,
Java's HashMap, C++'s std::map, and .NET's Dictionary).
"Example: In Perl, associative arrays are often
called hashes. Names begin with %; literals are
delimited by parentheses."
•In Perl, associative arrays are referred to as
hashes. A hash is identified by a percent sign (%) at
the beginning of its name. When writing hash
literals (literal key-value pairs), they are enclosed in
parentheses.
respective values (salaries). In this example, the
%salaries hash contains key-value pairs where the
keys are the names ("Gary", "Perry", etc.) and the
values are their salaries (75000, 57000, etc.).
$salaries{"Perry"} = 58850;
delete $salaries{"Gary"};
%salaries = ();
•To empty a hash in Perl, assign it the empty list (): %salaries = ();. The same
method empties an ordinary array (e.g., @salaries = ();), but @salaries and
%salaries are different variables, so assigning to @salaries does not clear
the hash.
"Python’s associative arrays, which are called dictionaries, are similar to those of
Perl, except the values are all reference to objects."
•In Python, associative arrays are called dictionaries. They are similar to Perl's
hashes, as both store key-value pairs. However, in Python, all values in the dictionary
are references to objects. This means that when you assign a value to a key, the value
refers to an object, and changes to the object will be reflected wherever that
reference is used.
"PHP’s arrays are both normal arrays and associative array."
•In PHP, arrays are flexible and can act as both normal arrays (indexed by integers)
and associative arrays (indexed by strings). This means PHP
arrays can behave in different ways depending on how they are used, making them
very versatile.
"A Lua table is an associative array in which both the keys and the values can be
any type."
•In Lua, the primary data structure for associative arrays is called a table. Unlike in
some other languages, in Lua, both keys and values in a table can be of any type
(e.g., strings, numbers, functions, or even other tables). This provides great flexibility
in how tables can be used.
"C# and F# support associative arrays through a .NET class."
•Both C# and F# run on .NET and support associative arrays
through a predefined .NET class. In C#, for example, the
Dictionary<TKey, TValue> class is used to store key-value pairs, where the key type and
value type are chosen when the dictionary is declared.
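For comparison with the Perl hash operations shown earlier, the same steps can be written with a Python dictionary. This is a sketch that reuses the salary data from the Perl example.

```python
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

salaries["Perry"] = 58850      # change the value stored under a key
del salaries["Gary"]           # remove one key-value pair

perry = salaries["Perry"]
has_mary = "Mary" in salaries  # test whether a key exists
remaining = sorted(salaries)   # the keys that are left, in sorted order

salaries.clear()               # empty the whole dictionary

print(perry, has_mary, remaining, len(salaries))
```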
Record Types
In C, C++, and C#, records are supported with the struct data type. In
C++, structures are a minor variation on classes.
Definitions of Records
struct Person {
    char name[50];
    int age;
};
Tuple Types
A tuple is a data type that is similar to a record, except that the elements are not named
Python
– Closely related to its lists, but tuples are immutable
– If a tuple needs to be changed, it can be converted to a list with the list function
– Create with a tuple literal
myTuple = (3, 5.8, 'apple')
– Note that the elements of a tuple need not be of the same type
– The elements of a tuple can be referenced with indexing in brackets, as in the
following:
myTuple[0]
This references the first element of the tuple, because tuple indexing begins at 0
– Tuples can be catenated with the plus (+) operator
– They can be deleted with the del statement
newTuple = (1, 2) + (3, 4) # Result: (1, 2, 3, 4)
Mixed Types in Tuples:
•Unlike arrays, the elements of a tuple need not be of the same
type. For example, a tuple can contain integers, strings, floats,
etc., all together in one tuple.
Accessing Elements in a Tuple:
•The elements of a tuple can be accessed using indexing in
brackets. Tuple indexing in Python starts from 0, not 1, so
myTuple[0] references the first element, which is 3
in this case.
Tuple Operations:
•Concatenation: Tuples can be concatenated (combined) using
the + operator. For example: newTuple = (1, 2) + (3, 4) # Result:
(1, 2, 3, 4)
•Deletion: You can remove a tuple using the del statement: del
myTuple
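The tuple properties described above (mixed element types, indexing from 0, immutability, conversion through a list, and catenation) can be verified with a short sketch; the snake_case names are illustrative.

```python
my_tuple = (3, 5.8, "apple")   # elements need not share a type

first = my_tuple[0]            # indexing begins at 0

# Tuples are immutable: element assignment raises TypeError.
try:
    my_tuple[0] = 4
    mutated = True
except TypeError:
    mutated = False

# To "change" a tuple, convert it to a list, edit, and convert back.
as_list = list(my_tuple)
as_list[0] = 4
changed = tuple(as_list)

new_tuple = (1, 2) + (3, 4)    # catenation builds a new tuple

print(first, mutated, changed, new_tuple)
```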
ML
– Create with a tuple
– Access as follows:
#1(myTuple);
This references the first element
F#
Destructuring a Tuple:
•F# allows you to destructure a tuple directly into individual variables. This means
you can assign each element of the tuple to a separate variable.
•The following code does this: let a, b, c = tup;;
Here, the values of the tuple tup are assigned to the variables a, b, and c in order.
Specifically:
•3 is assigned to a
•5 is assigned to b
•7 is assigned to c
Tuples in Functions:
•In F#, as in Python and ML, tuples are often used to return multiple values from a
function. This allows functions to return more than one value without the need for
complex data structures.
•Example of a function returning a tuple: let addAndMultiply x y = (x + y, x * y)
•This function takes two numbers, adds them, and multiplies them, returning both
results as a tuple.
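Since the text notes that Python uses tuples the same way, the F# ideas above can be mirrored in Python. This is a sketch of the equivalent Python code, not a translation tool; the function name add_and_multiply follows the F# example.

```python
def add_and_multiply(x, y):
    # Returning two values packs them into a single tuple.
    return x + y, x * y


# Destructuring: each element of the tuple goes to its own variable.
a, b, c = (3, 5, 7)

# The returned tuple can be destructured in the same way.
total, product = add_and_multiply(3, 5)

print(a, b, c, total, product)
```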
List Types
(A B C D)
Nested lists have the same form, so we could have
(A (B C) D)
In this list, (B C) is a list nested inside the outer list
7. "As data, (A B C) is literally what it is."
•If treated as data, (A B C) is simply a list with three elements: A,
B, and C.
•It does not execute anything—it just stores the values.
8. "As code, (A B C) is the function A applied to the parameters
B and C."
•If the system treats it as code, then:
• A is assumed to be a function.
• B and C are arguments passed to the function A.
•This means the system will execute A(B, C), just like calling a
function in other programming languages.
List Types
– CDR returns the remainder of its list parameter after the first element
has been removed
"The interpreter needs to know which a list is, so if it is data, we quote it with an
apostrophe."
•In Scheme and Lisp, lists can be treated as code (function calls) or data (just a
collection of values).
•To tell the interpreter that a list is data and not code, we use an apostrophe (')
before it
EX: '(A B C)
This tells the interpreter: "Do not treat A as a function; this is just a list of values."
Without the apostrophe, the system might try to execute A as a function.
"CDR returns the remainder of its list parameter after the first element has been
removed."
•The CDR function returns everything after the first element; the original list is not modified.
•Example: (cdr '(A B C)) ; Returns (B C)
The result omits the first element A, leaving (B C).
List Types
– CONS puts its first parameter into its second parameter, a list, to make
a new list
"CONS puts its first parameter into its second parameter, a list, to make a new list."
•The CONS function takes two arguments:
• A single element (first parameter).
• A list (second parameter).
•It adds the element to the front of the list to form a new list.
•Example: (cons 'A '(B C)) ; Returns (A B C)
A is added to the front of (B C), creating (A B C).
If the second parameter is not a list, CONS creates a pair (dotted pair):
EX: (cons 'A 'B) ; Returns (A . B)
This is not a regular list but a pair (used in association lists).
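The behavior of CAR, CDR, and CONS can be modeled in Python. This is only a sketch under an assumed representation: a pair is a 2-tuple (head, tail) and None stands for the empty list. The helper from_python_list is hypothetical and is not a Scheme function.

```python
# Model a Scheme pair as a 2-tuple (head, tail); None stands for the empty list.
def cons(head, tail):
    return (head, tail)

def car(pair):
    return pair[0]

def cdr(pair):
    return pair[1]

def from_python_list(items):
    """Build a proper list of nested pairs from a Python list."""
    result = None
    for item in reversed(items):
        result = cons(item, result)
    return result

bc = from_python_list(["B", "C"])   # models (B C)
abc = cons("A", bc)                 # models (A B C)
dotted = cons("A", "B")             # models (A . B), not a proper list

print(car(abc))
print(cdr(abc) == bc)
```

In this model a proper list always ends in None, which is why (cons 'A 'B) yields a dotted pair: its tail is the atom B rather than a list.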
List Types
List operations in ML
– Lists are written in brackets and the elements are separated by
commas, as in the following list of integers:
[5, 7, 9]
– List elements must be of the same type, so the following list would
be illegal:
[5, 7.3, 9]
"Lists are written in brackets and the elements are separated by commas, as in the
following list of integers: [5, 7, 9]"
•In ML (MetaLanguage), lists use square brackets [ ] to enclose elements.
•Elements are separated by commas.
•Example of a valid list of integers: [5, 7, 9] (* A list containing 5, 7, and 9 *)
"List elements must be of the same type, so the following list would be illegal: [5,
7.3, 9]"
•ML lists are homogeneous, meaning all elements must be of the same type.
•The example [5, 7.3, 9] is invalid because:
• 5 and 9 are integers (int).
• 7.3 is a floating-point number (real).
• ML does not allow mixing integers and floats in the same list.
•If you need a list of floating-point numbers, all elements must be floats: [5.0, 7.3,
9.0] (* This is a valid list of real numbers *)
List Types
hd [5, 7, 9] is 5
tl [5, 7, 9] is [7, 9]
"The ML functions that correspond to Scheme’s CAR and CDR are named
hd (head) and tl (tail), respectively."
•In Scheme,
• CAR returns the first element of a list.
• CDR returns everything except the first element.
•In ML,
• The equivalent of CAR is hd (head).
• The equivalent of CDR is tl (tail).
List Types
F# Lists
– Like those of ML, except elements are separated by semicolons
and hd and tl are methods of the List class
"F# Lists – Like those of ML, except elements are separated by semicolons and hd
and tl are methods of the List class."
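•F# lists are similar to ML lists, but instead of commas, they use semicolons (;) to
separate elements.
•In ML, lists use brackets [ ] and elements are separated by commas, as in [5, 7, 9].
•In F#, lists also use brackets [ ], but elements are separated by semicolons, as in
[1; 3; 5; 7].
•hd and tl are reached through List rather than being standalone functions; current
F# names them List.head and List.tail.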
List Types
Python Lists
– The list data type also serves as Python’s arrays
– Unlike Scheme, Common Lisp, ML, and F#, Python’s lists are mutable
– Elements can be of any type
– Create a list with an assignment
– List elements are referenced with subscripting, with indices beginning at zero
"Unlike Scheme, Common Lisp, ML, and F#, Python’s lists are mutable."
•Mutable means that Python lists can be changed (elements can be added, removed,
or modified).
•In contrast, lists in Scheme, Common Lisp, ML, and F# are immutable (they cannot
be changed after creation).
myList = [1, 2, 3]
myList[1] = 99 # Changes second element to 99
print(myList) # Output: [1, 99, 3]
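A slightly longer sketch shows the other in-place changes that mutability allows. The variable names are assumptions; the point is that a second name bound to the same list sees every change.

```python
my_list = [1, 2, 3]
alias = my_list          # both names refer to the same list object

my_list.append(4)        # add an element in place
del my_list[0]           # remove the first element in place
my_list[0] = "two"       # elements may be of any type

print(alias)             # the alias reflects every change
```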
Union Types
1. "A union is a type whose variables are allowed to store different type values at
different times during execution."
•A union is a special type of variable that can hold different types of values, but only
one value at a time.
•It shares the same memory location for different data types.
•Used in C, C++, and other low-level languages for memory-efficient programming.
Unions in F#
Union Types in F#
In F#, a union type allows a variable to hold different types of values at different
times.
Syntax
A union is declared using the type keyword and the OR operator (|) to define
multiple possible cases. Case names must begin with an uppercase letter:
type intReal =
| IntValue of int
| RealValue of float;;
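For comparison, a rough Python analogue is sketched below. The alias IntReal and the function describe are assumptions for illustration; Python checks the stored type at run time with isinstance instead of using named cases.

```python
from typing import Union

# Hypothetical alias: a value that is either an int or a float.
IntReal = Union[int, float]

def describe(value: IntReal) -> str:
    """Report which kind of value is currently stored."""
    if isinstance(value, int):
        return f"int {value}"
    return f"real {value}"

x: IntReal = 7          # holds an int now
first = describe(x)
x = 2.5                 # later holds a float
second = describe(x)

print(first, second)
```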
Unions in F#
Example:
Pattern       Meaning                                        Matches?   Result
4, "apple"    Matches only if a = 4 and b = "apple"          No         Skipped
_, "grape"    Matches any a if b = "grape"                   Yes        grape
_             Default catch-all if no above match is found   -          Not reached
let a = 7;;
•Declares a variable a with value 7.
let b = "grape";;
•Declares a variable b with value "grape".
let x = match (a, b) with
•Uses pattern matching on the tuple (a, b) which is (7, "grape").
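The first-match-wins order of the patterns can be imitated in Python with an if/elif chain. This is a sketch rather than F# pattern matching itself, and the results "apple" and "fruit" for the first and last arms are assumed placeholders.

```python
a = 7
b = "grape"

# Arms are tried top to bottom; the first one that matches wins.
if (a, b) == (4, "apple"):
    x = "apple"          # assumed result for the first arm
elif b == "grape":       # plays the role of the pattern _, "grape"
    x = "grape"
else:                    # plays the role of the catch-all _
    x = "fruit"          # hypothetical result for the default arm

print(x)
```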
Pointer and Reference Types
A pointer type is one in which the variables have a range of values that consists
of memory addresses and a special value, nil.
The value nil is not a valid address and is used to indicate that a pointer
cannot currently be used to reference any memory cell.
Pointer Type:
•A pointer is like a signpost that tells you where to find something in memory (like an
address for a house).
•It can hold two types of values:
• A memory address (the location of some data in your computer's
memory).
• Nil, which means the pointer isn't pointing to anything right now (kind of
like an empty signpost or a broken one).
Nil:
•Nil is just a special value used to show that the pointer isn't currently pointing to any
valid memory location.
•Think of it like a "no address" or "empty" signpost. It tells you that the pointer isn’t
directing you to anything useful.
Pointer Operations
A pointer type usually includes two fundamental pointer operations,
assignment and dereferencing.
Assignment sets a pointer variable's value to some useful address.
Dereferencing takes a reference through one level of indirection.
– In C++, dereferencing is explicitly specified with the asterisk (*) as a prefix
unary operator.
– If ptr is a pointer variable with the value 7080, and the cell whose address
is 7080 has the value 206, then the assignment
j = *ptr;
sets j to 206.
This assigns the value pointed to by ptr to the variable j.
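Python has no pointer syntax of its own, but its standard ctypes module exposes C-style pointers, so both operations can be sketched with it. The names cell and null_ptr are assumptions, and the actual address is chosen by the system rather than being 7080.

```python
import ctypes

cell = ctypes.c_int(206)       # the memory cell holding 206
ptr = ctypes.pointer(cell)     # assignment: ptr now holds the cell's address
j = ptr.contents.value         # dereferencing, like j = *ptr in C++

# A nil (NULL) pointer: it refers to no memory cell.
null_ptr = ctypes.POINTER(ctypes.c_int)()
try:
    null_ptr.contents          # dereferencing nil is an error
    nil_deref_failed = False
except ValueError:
    nil_deref_failed = True

print(j, bool(null_ptr), nil_deref_failed)
```

A NULL ctypes pointer is falsy, which gives a simple way to test for nil before dereferencing.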
Pointer Operations
In C and C++, there are two ways a pointer to a record can be used to reference a
field in that record.
– If a pointer variable p points to a record with a field named age, (*p).age can be
used to refer to that field.
– The operator ->, when used between a pointer to a struct and a field of that
struct, combines dereferencing and field reference.
– For example, the expression p->age is equivalent to (*p).age.
In C and C++, when you're working with a pointer to a record (like a struct), you can
reference the fields of the struct in two ways.
Using (*p).field
•When you have a pointer p that points to a struct (or record), you can access a field
using (*p).field.
•(*p) dereferences the pointer (so it accesses the struct the pointer is pointing to),
and then .field accesses a specific field within that struct.
If p is a pointer to a struct, and the struct has a field age, you can access it like this:
(*p).age;
This means:
•Dereference the pointer p to access the struct it points to.
•Access the age field in that struct.
Using p->field
•The -> operator is shorthand for dereferencing the pointer and accessing the field in
a single step.
•It combines the dereferencing (*p) and the field reference (.field) into one operation,
making it more concise and easier to read.
Like this: p->age;
This is equivalent to: (*p).age;
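The same access pattern can be sketched with Python's ctypes module. The Person structure is a hypothetical record; Python has no -> operator, so p.contents plays the role of (*p).

```python
import ctypes

class Person(ctypes.Structure):      # hypothetical record with one field
    _fields_ = [("age", ctypes.c_int)]

person = Person(30)
p = ctypes.pointer(person)           # p points to the record

age_via_deref = p.contents.age       # like (*p).age
p.contents.age = 31                  # write through the pointer, like p->age = 31

print(age_via_deref, person.age)
```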
Pointer Operations
Languages that provide pointers for the management of a heap must
include an explicit allocation operation.
– Allocation is sometimes specified with a subprogram, such as malloc
in C.
– In a language that supports object-oriented programming, allocation
of heap objects is often specified with the new operator. C++, which does
not provide implicit deallocation, uses delete as its deallocation
operator.
Using malloc in C
•In C, memory is allocated on the heap using the function malloc. The name malloc
stands for memory allocation, and the function is used to request a specific amount
of memory during the execution of a program.
Pointer Problems
Dangling Pointers (dangerous)
– A pointer points to a heap-dynamic variable that has been deallocated.
– Dangling pointers are dangerous for the following reasons:
1. The location being pointed to may have been allocated to some new heap-
dynamic variable.
- If the new variable is not the same type as the old one, type checks of uses of
the dangling pointer are invalid.
2. Even if the new one is the same type, its new value will bear no relationship to
the old pointer’s dereferenced value.
3. If the dangling pointer is used to change the heap-dynamic variable, the value of
the heap-dynamic variable will be destroyed.
4. It is possible that the location now is being temporarily used by the storage
management system, possibly as a pointer in a chain of available blocks of
storage, thereby allowing a change to the location to cause the storage manager
to fail.
Pointers in C and C++
Type Checking
Type checking is the activity of ensuring that the operands of an
operator are of compatible types.
A compatible type is one that is either legal for the operator, or is
allowed under language rules to be implicitly converted, by compiler-
generated code, to a legal type.
This automatic conversion is called a coercion.
– Ex: if an int variable and a float variable are added in Java, the value of
the int variable is coerced to float and a floating-point addition is performed.
A type error is the application of an operator to an operand of an
inappropriate type.
– Ex: in early versions of C, if an int value was passed to a function that
expected a float value, a type error would occur (compilers did not check the
types of parameters).
1. Type checking makes sure the data types used in operations match or are
allowed.
2. A compatible type is one that works with the operation or can be safely changed
into the correct type by the compiler.
3. This automatic change is called coercion.
1. Example: In Java, if you add an int and a float, the int is automatically
changed to a float, and the result is a float.
4. A type error happens when the wrong type of data is used in an operation.
1. Example: In C, if a function expects a float but gets an int, that’s a type
error — especially in older C compilers that didn’t check for this.
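Coercion and type errors can both be observed in a few lines of Python, which also coerces int to float in mixed arithmetic. This is a sketch with assumed variable names, not the Java or C code the slide refers to.

```python
count = 2            # int
ratio = 1.5          # float

total = count + ratio        # count is coerced to float before the addition

try:
    bad = "2" + count        # str + int: no coercion is defined, so this is a type error
    type_error_raised = False
except TypeError:
    type_error_raised = True

print(total, type(total).__name__, type_error_raised)
```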
Type Checking
If all type bindings are static, nearly all type checking can be static.
If type bindings are dynamic, type checking must be dynamic and done
at run-time.
Some languages, such as JavaScript and PHP, because of their type
binding, allow only dynamic type checking.
It is better to detect errors at compile time than at run time, because the
earlier correction is usually less costly.
– The penalty for static checking is reduced programmer flexibility.
Type checking is complicated when a language allows a memory cell to
store values of different types at different times during execution.
1. If all data types are fixed before running the program, most type checking can also
be done before running.
2. If data types are decided while the program is running, type checking must also
happen during the run.
3. Languages like JavaScript and PHP do type checking while the program runs
because their types are dynamic.
4. It’s better to find errors before running the program, because fixing them earlier
is easier and cheaper.
1. The downside of early (static) checking is that it gives programmers less
flexibility.
5. Type checking becomes harder when a variable can hold different types at
different times during program execution.
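Python is also dynamically typed, so it can illustrate the cost of run-time checking: a type error stays hidden until the faulty line actually executes. The function label below is a hypothetical example.

```python
def label(value, shout):
    if shout:
        return value.upper() + 1    # type error: str + int
    return value

# The faulty branch never runs here, so no error is reported.
ok = label("data", shout=False)

# The error is detected only when the branch is executed.
try:
    label("data", shout=True)
    detected = False
except TypeError:
    detected = True

print(ok, detected)
```

A static checker would reject the faulty line before the program ran, regardless of which branch is taken.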
Strong Typing
A programming language is strongly typed if type errors are always
detected. This requires that the types of all operands can be
determined, either at compile time or run time.
Advantage of strong typing: allows the detection of the misuses of
variables that result in type errors.
C and C++ are not strongly typed languages because both include union
types, which are not type checked.
Java and C# are strongly typed. Types can be explicitly cast, which could
result in a type error. However, there are no implicit ways type errors can
go undetected.
The coercion rules of a language have an important effect on the value
of type checking.
Strong Typing
Coercion results in a loss of part of the reason for strong typing – error
detection.
– Ex:
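A minimal Python sketch of the problem, with hypothetical variable names: the programmer means to add two integers but mistypes one operand, and because the int is silently coerced to float, no error is reported.

```python
a = 3          # int
b = 4          # int
d = 2.5        # float

intended = a + b     # what the programmer meant to write
typed = a + d        # typo: d instead of b; a is coerced to float, no error is raised

print(intended, typed)
```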
Type Equivalence
Two types are equivalent if an operand of one type in an expression can be
substituted for one of the other type, without coercion.
There are two approaches to defining type equivalence: name type
equivalence and structure type equivalence.
Name type equivalence means that two variables have equivalent types if
they are in either the same declaration or in declarations that use the
same type name.
– Easy to implement but highly restrictive:
– Subranges of integer types are not equivalent with integer types
– Formal parameters must be the same type as their corresponding
actual parameters
– Ex: Ada
1. Two types are the same if you can use one in place of the other without needing to
convert it.
2. There are two ways to decide if types are the same: by name or by structure.
3. Name type equivalence means two variables are the same type only if they were
declared using the same type name or in the same declaration.
1. This method is easy to use in programming but has many limits.
2. If you create a smaller range from an integer (like 1 to 10), it won’t be treated
the same as a regular integer.
3. The type of variables used in function definitions must exactly match the type of
variables passed when calling the function.
1. For example, this strict rule is used in the Ada programming language.
- This code snippet is showing an example of name type equivalence using
the Ada programming language (or similar syntax).
Type Equivalence
Structure type equivalence means that two variables have equivalent
types if their types have identical structures
– More flexible, but harder to implement
This rule says two variables are the same if their structure (or shape) is the same,
even if their names or ranges are different.
It's more flexible than name type equivalence, but harder to build into a
programming language.
The two variables are considered equivalent because their structure is the same —
they’re both arrays of integers.
The range of indexes doesn’t matter here because the array type is unconstrained,
which means its size can vary.
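The two approaches can be contrasted in Python. This is only an analogy under assumed names (Celsius, Fahrenheit, HasDegrees): comparing classes models name equivalence, and a runtime-checkable Protocol models structure equivalence.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

@dataclass
class Celsius:
    degrees: float

@dataclass
class Fahrenheit:
    degrees: float

@runtime_checkable
class HasDegrees(Protocol):
    """Structural description: anything with a degrees attribute."""
    degrees: float

c = Celsius(20.0)
f = Fahrenheit(20.0)

# Name equivalence: the type names differ, so the types are not equivalent.
same_by_name = type(c) is type(f)

# Structure equivalence: both have the same structure, so both are accepted.
same_by_structure = isinstance(c, HasDegrees) and isinstance(f, HasDegrees)

print(same_by_name, same_by_structure)
```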
Type Equivalence
C uses both name and structure type equivalence.
– Name type equivalence is used for structure, enumeration, and
union types.
– Other nonscalar types use structure type equivalence. Array types are
equivalent if they have the same type components. Also, if an array type
has a constant size, it is equivalent either to other arrays with the
same constant size or to those without a constant size.
In languages that do not allow users to define and name types, such as
Fortran and COBOL, name equivalence obviously cannot be used.
Name type equivalence is used for structure, enumeration, and union types.
→ For struct, enum, and union, two variables are the same type only if they were declared
using the same name.
Array types are equivalent if they have the same type components.
→ Arrays are considered the same if the type of their elements is the same (e.g., both are
arrays of integers).
Also, if an array type has a constant size, it is equivalent either to other arrays with the same
constant size or to those without a constant size.
→ If an array has a fixed size (like 5 elements), it's seen as the same type as other arrays with:
•the same size, or
•no fixed size.
In languages that do not allow users to define and name types, such as Fortran and COBOL,
name equivalence obviously cannot be used.
→ In some older languages like Fortran and COBOL, since you can’t create and name new
types, name-based checking isn’t possible. Only structure-based comparison is used.
Theory and Data Types
Type theory is a broad area of study in mathematics, logic, computer
science, and philosophy.
Two branches of type theory in computer science:
– Practical – The practical branch is concerned with data types in
commercial programming languages
– Abstract – The abstract branch primarily focuses on typed lambda
calculus, an area of extensive research by theoretical computer
scientists over the past half century
A data type defines a set of values and a collection of operations on
those values
A type system is a set of types and the rules that govern their use in
programs
Type theory is a big field studied in math, logic, computer science, and philosophy.
→ Many areas of study explore how types work and how we use them.
A data type is a group of possible values and the actions you can do with them.
→ For example, an integer type includes numbers and allows math operations.
A type system is a collection of types and rules for how they are used in a program.
→ It makes sure you use data the right way, like not mixing strings with numbers in
math.