CSC 304 Note Edc
In computer science, primitive data types are a set of basic data types from which all other data types
are constructed. Specifically, it often refers to the limited set of data representations in use by a
particular processor, which all compiled programs must use. Most processors support a similar set of
primitive data types, although the specific representations vary. More generally, "primitive data types"
may refer to the standard data types built into a programming language (built-in types). Data types
which are not primitive are referred to as derived or composite.
The most common primitive types are those used and supported by computer hardware, such as
integers of various sizes, floating-point numbers, and Boolean logical values. Operations on such types
are usually quite efficient. Primitive data types which are native to the processor have a one-to-one
correspondence with objects in the computer's memory, and operations on these types are often the
fastest possible. Integer addition, for example, can be performed as a single machine
instruction, and some processors offer specific instructions to process sequences of characters with a
single instruction. The choice of primitive data type may still affect performance: for example, operating
on an array of floats is faster with SIMD (single instruction, multiple data) operations and data types.
Integer numbers
An integer data type represents some range of mathematical integers. Integers may be either signed
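(allowing negative values) or unsigned (non-negative integers only). Common ranges are:
8-bit: −128 to 127 (signed), 0 to 255 (unsigned)
16-bit: −32,768 to 32,767 (signed), 0 to 65,535 (unsigned)
32-bit: −2,147,483,648 to 2,147,483,647 (signed), 0 to 4,294,967,295 (unsigned)
64-bit: −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (signed), 0 to 18,446,744,073,709,551,615 (unsigned)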
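The range of an integer type follows from its bit width. The sketch below assumes two's-complement signed integers (the usual hardware representation); the function names unsigned_max, signed_max, and signed_min are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of an unsigned integer that is `bits` wide (1..64). */
uint64_t unsigned_max(unsigned bits) {
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined, so handle that width separately. */
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Largest value of a signed two's-complement integer that is `bits` wide. */
int64_t signed_max(unsigned bits) {
    return (int64_t)(unsigned_max(bits) >> 1);
}

/* Smallest value of a signed two's-complement integer that is `bits` wide. */
int64_t signed_min(unsigned bits) {
    return -signed_max(bits) - 1;
}
```

For an 8-bit width, for instance, signed_min(8) and signed_max(8) give -128 and 127, and unsigned_max(8) gives 255.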
Floating-point numbers.
A floating-point number represents a limited-precision rational number that may have a fractional part.
These numbers are stored internally in a format equivalent to scientific notation, typically in binary but
sometimes in decimal. Because floating-point numbers have limited precision, only a subset of real or
rational numbers are exactly representable; other numbers can be represented only approximately.
Many languages have both a single precision (often called "float") and a double precision type (often
called "double").
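The effect of limited precision can be seen in a short C sketch. It assumes the common IEEE 754 double format; sum_tenths and nearly_equal are hypothetical helper names, and the tolerance is something each program must choose for itself.

```c
#include <assert.h>

/* Add 0.1 to itself n times; 0.1 is not exactly representable in binary,
   so a small rounding error is introduced along the way. */
double sum_tenths(int n) {
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += 0.1;
    }
    return total;
}

/* Compare two doubles within an absolute tolerance instead of exactly. */
int nearly_equal(double a, double b, double tolerance) {
    assert(tolerance >= 0.0);
    double diff = a - b;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff <= tolerance;
}
```

Under that assumption, sum_tenths(10) is not exactly equal to 1.0, which is why floating-point results are usually compared with a tolerance, as nearly_equal does, rather than with ==.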
Booleans.
A Boolean type, typically denoted "bool" or "Boolean", is a logical type that can have either the
value "true" or the value "false". Although only one bit is necessary to accommodate the value set
"true" and "false", programming languages typically implement boolean types as one or more bytes.
Many languages (e.g. Java, Pascal and Ada) implement Booleans adhering to the concept of boolean as a
distinct logical type. Some languages, though, may implicitly convert booleans to numeric types at times
to give extended semantics to booleans and boolean expressions or to achieve backwards compatibility
with earlier versions of the language. For example, early versions of the C programming language, up to
and including ANSI C (C89), did not have a dedicated boolean type. Instead, numeric
values of zero are interpreted as "false", and any other value is interpreted as "true". The newer C99
added a distinct boolean type _Bool (the more intuitive name bool as well as the macros true and false
can be included with stdbool.h), and C++ supports bool as a built-in type and "true" and "false" as
reserved words.
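A small C sketch of both behaviours, using bool from stdbool.h (C99 or later). The helper names is_truthy and normalized are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* C's historical rule: zero is false, any other integer is true. */
bool is_truthy(int value) {
    return value != 0;
}

/* Converting an integer to bool normalizes every nonzero value to 1. */
int normalized(int value) {
    bool flag = value;              /* nonzero becomes true, zero becomes false */
    assert(flag == 0 || flag == 1); /* a bool only ever holds 0 or 1 */
    return flag;                    /* true converts back to the integer 1 */
}
```

Here normalized(42) yields 1 and normalized(0) yields 0. Although one bit would be enough, sizeof(bool) is at least one byte.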
Specific languages.
Java.
The Java virtual machine's set of primitive data types consists of:
byte, short, int, long, char (integer types with a variety of ranges)
float and double, floating-point numbers with single and double precisions
boolean, a Boolean type with logical values true and false
returnAddress, a value referring to an executable memory address. This is not accessible from
the Java programming language and is usually left out.
C basic types.
The set of basic C data types is similar to Java's. Minimally, there are four types, char, int, float, and
double, but the qualifiers short, long, signed, and unsigned mean that C contains numerous target-
dependent integer and floating-point primitive types. C99 extended this set by adding the boolean type
_Bool and allowing the modifier long to be used twice in combination with int (e.g. long long int).
XML Schema.
The XML Schema Definition language provides a set of 19 primitive data types:
boolean: a boolean
duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, and gMonth: Calendar
dates and times
anyURI: a URI
NOTATION: a QName declared as a notation in the schema. Notations are used to embed non-
XML data types.[18] This type cannot be used directly - only derived types that enumerate a
limited set of QNames may be used.
JavaScript.
In JavaScript, there are 7 primitive data types: string, number, bigint, boolean, undefined, symbol, and
null. These are not objects and have no methods.
Visual Basic .NET.
In Visual Basic .NET, the primitive data types consist of 4 integral types, 2 floating-point types, a 16-byte
decimal type, a boolean type, a date/time type, a Unicode character type, and a Unicode string type. [20]
Rust.
Rust has primitive unsigned and signed fixed-width integers, written as u or i respectively followed by
any bit width that is a power of two between 8 and 128, giving the types u8, u16, u32, u64, u128, i8, i16,
i32, i64 and i128. Also available are the types usize and isize, which are unsigned and signed integers that
are the same bit width as a reference; the usize type is used for indices into arrays and
indexable collection types.
char for a Unicode character. Under the hood these are unsigned 32-bit integers with values that
correspond to the char's code point, but only values that correspond to a valid Unicode scalar
value are valid.[21]
Built-in types.
Built-in types are distinguished from others by having specific support in the compiler or runtime, to the
extent that it would not be possible to simply define them in a header file or standard library module.
Besides integers, floating-point numbers, and Booleans, other built-in types include:
The void type and null pointer type nullptr_t in C++ and C23
Complex number in C99, Fortran, Common Lisp, Python, D, Go. This is two floating-point
numbers, a real part and an imaginary part.
Associative arrays, records, and/or sets in Perl, PHP, Python, Ruby, JavaScript, Lua, D, Go
Symbols, in Lisp
First-class function, in all functional languages, JavaScript, Lua, D, Go, and in newer standards of
C++, Java, C#, Perl
A character type is a type that can represent all Unicode characters, hence must be at least 21 bits wide.
Some languages such as Julia include a true 32-bit Unicode character type as primitive. Other languages
such as JavaScript, Python, Ruby, and many dialects of BASIC do not have a primitive character type but
instead add strings as a primitive data type, typically using a Unicode encoding such as UTF-8 or UTF-16.
Strings with a length of one are normally used to represent single characters.
Some languages have "character" types that are too small to represent all Unicode characters. These are
more properly categorized as integer types that have been given a misleading name. For example C
includes a char type, but it is defined to be the smallest addressable unit of memory, which several
standards (such as POSIX) require to be 8 bits. Recent versions of these standards refer to char as a
numeric type. char is also used for a 16-bit integer type in Java, but again this is not a Unicode character
type.
The term "string" also does not always refer to a sequence of Unicode characters; it sometimes refers
instead to a sequence of bytes. For example, x86-64 has "string" instructions to move, set, search, or compare a
sequence of items, where an item could be 1, 2, 4, or 8 bytes long.
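ARRAYS
An array is a data structure consisting of a collection of elements, each identified by at least one index.
For example, an array of ten 32-bit (4-byte) integer variables, with indices 0 through 9, may be stored as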
ten words at memory addresses 2000, 2004, 2008, ..., 2036, (in hexadecimal: 0x7D0, 0x7D4, 0x7D8, ...,
0x7F4) so that the element with index i has the address 2000 + (i × 4).[4] The memory address of the
first element of an array is called first address, foundation address, or base address.
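The address calculation in this example can be sketched in C. The function names element_address and byte_offset are hypothetical; the first works on plain numbers, as in the example, and the second measures the same distance on a real array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of element `index`: base address + (index * element size in bytes). */
uintptr_t element_address(uintptr_t base, size_t index, size_t element_size) {
    assert(element_size > 0);
    return base + (uintptr_t)(index * element_size);
}

/* Distance in bytes between element `index` and the first element of an array. */
size_t byte_offset(const int32_t *array, size_t index) {
    return (size_t)((const char *)&array[index] - (const char *)&array[0]);
}
```

With a base address of 2000 and 4-byte elements, element_address(2000, 9, 4) gives 2036 (0x7F4), the address of the last element in the example.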
Because the mathematical concept of a matrix can be represented as a two-dimensional grid, two-
dimensional arrays are also sometimes called "matrices". In some cases, the term "vector" is used in
computing to refer to an array, although tuples rather than vectors are the more mathematically correct
equivalent. Tables are often implemented in the form of arrays, especially lookup tables; the word
"table" is sometimes used as a synonym of array.
Arrays are among the oldest and most important data structures, and are used by almost every
program. They are also used to implement many other data structures, such as lists and strings. They
effectively exploit the addressing logic of computers. In most modern computers and many external
storage devices, the memory is a one-dimensional array of words, whose indices are their addresses.
Processors, especially vector processors, are often optimized for array operations.
Arrays are useful mostly because the element indices can be computed at run time. Among other things,
this feature allows a single iterative statement to process arbitrarily many elements of an array. For that
reason, the elements of an array data structure are required to have the same size and should use the
same data representation. The set of valid index tuples and the addresses of the elements (and hence
the element addressing formula) are usually, but not always, fixed while the array is in use.
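As a minimal sketch of this point, the C function below (sum_array is a made-up name) uses one loop statement to process however many elements it is given, because the index i is computed at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Sum `count` elements; the same loop body handles any array length. */
long sum_array(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];   /* index i is computed while the program runs */
    }
    return total;
}
```

The loop works only because every element has the same size, so values[i] can be located from i alone.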
The term "array" may also refer to an array data type, a kind of data type provided by most high-level
programming languages that consists of a collection of values or variables that can be selected by one or
more indices computed at run-time. Array types are often implemented by array structures; however, in
some languages they may be implemented by hash tables, linked lists, search trees, or other data
structures.
The term is also used, especially in the description of algorithms, to mean associative array or "abstract
array", a theoretical computer science model (an abstract data type or ADT) intended to capture the
essential properties of arrays.
History.
The first digital computers used machine-language programming to set up and access array structures
for data tables, vector and matrix computations, and for many other purposes. John von Neumann
wrote the first array-sorting program (merge sort) in 1945, during the building of the first stored-
program computer. Array indexing was originally done by self-modifying code, and later using index
registers and indirect addressing. Some mainframes designed in the 1960s, such as the Burroughs B5000
and its successors, used memory segmentation to perform index-bounds checking in hardware.
Assembly languages generally have no special support for arrays, other than what the machine itself
provides. The earliest high-level programming languages, including FORTRAN (1957), Lisp (1958), COBOL
(1960), and ALGOL 60 (1960), had support for multi-dimensional arrays, and so has C (1972). In C++
(1983), class templates exist for multi-dimensional arrays whose dimension is fixed at runtime as well as
for runtime-flexible arrays.
Applications.
Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular
tables. Many databases, small and large, consist of (or include) one-dimensional arrays whose elements
are records.
Arrays are used to implement other data structures, such as lists, heaps, hash tables, queues, stacks,
strings, and VLists. Array-based implementations of other data structures are frequently simple and
space-efficient (implicit data structures), requiring little space overhead, but may have poor space
complexity, particularly when modified, compared to tree-based data structures (compare a sorted
array to a search tree).
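As one illustration, a stack can be built directly on top of an array. This is only a sketch with a fixed capacity; the type IntStack, the constant STACK_CAPACITY, and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* A stack stored in an array: the only overhead is one counter. */
typedef struct {
    int items[STACK_CAPACITY];
    size_t count;               /* number of elements currently stored */
} IntStack;

void stack_init(IntStack *stack) {
    assert(stack != NULL);
    stack->count = 0;
}

/* Push a value; returns false if the array is already full. */
bool stack_push(IntStack *stack, int value) {
    assert(stack != NULL);
    if (stack->count == STACK_CAPACITY) {
        return false;
    }
    stack->items[stack->count++] = value;
    return true;
}

/* Pop the most recently pushed value into *out; returns false if empty. */
bool stack_pop(IntStack *stack, int *out) {
    assert(stack != NULL && out != NULL);
    if (stack->count == 0) {
        return false;
    }
    *out = stack->items[--stack->count];
    return true;
}
```

The implementation is simple and compact, but the fixed capacity shows the cost: growing the stack beyond STACK_CAPACITY would require allocating a larger array and copying the elements.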
One or more large arrays are sometimes used to emulate in-program dynamic memory allocation,
particularly memory pool allocation. Historically, this has sometimes been the only way to allocate
"dynamic memory" portably.
Arrays can be used to determine partial or complete control flow in programs, as a compact alternative
to (otherwise repetitive) multiple IF statements. They are known in this context as control tables and are
used in conjunction with a purpose-built interpreter whose control flow is altered according to values
contained in the array. The array may contain subroutine pointers (or relative subroutine numbers that
can be acted upon by SWITCH statements) that direct the path of the execution.
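A very small control table might look like the following C sketch, where the array holds subroutine pointers and an operation code selects the entry. The names CONTROL_TABLE, run_operation, and the three operations are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Operation)(int, int);

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

/* The control table: the opcode is the index, the entry is the subroutine. */
static const Operation CONTROL_TABLE[] = { op_add, op_sub, op_mul };

/* The "interpreter": looks up the opcode and calls the selected subroutine,
   replacing a chain of IF statements with one array access. */
int run_operation(size_t opcode, int a, int b) {
    assert(opcode < sizeof CONTROL_TABLE / sizeof CONTROL_TABLE[0]);
    return CONTROL_TABLE[opcode](a, b);
}
```

Adding a new operation then means adding one entry to the table rather than another IF branch.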
When data objects are stored in an array, individual objects are selected by an index that is usually a
non-negative scalar integer. Indexes are also called subscripts. An index maps the array value to a stored
object.
There are three ways in which the elements of an array can be indexed:
0 (zero-based indexing): the first element of the array is indexed by a subscript of 0.
1 (one-based indexing): the first element of the array is indexed by a subscript of 1.
n (n-based indexing): the base index of an array can be freely chosen. Programming languages that
allow n-based indexing usually also allow negative index values, and other scalar data types such as
enumerations or characters may be used as an array index.
Using zero-based indexing is the design choice of many influential programming languages, including C,
Java and Lisp. This leads to a simpler implementation where the subscript refers to an offset from the
starting position of an array, so the first element has an offset of zero.
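Because C itself is zero-based, an index in any other base has to be translated into an offset before the array is accessed. The sketch below shows that translation; offset_from_base and get_based are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Translate an index into an offset from the start of the array,
   where `base` is the index assigned to the first element. */
size_t offset_from_base(long index, long base) {
    assert(index >= base);
    return (size_t)(index - base);
}

/* Read an element from an array whose first element has index `base`. */
int get_based(const int *array, long base, long index) {
    return array[offset_from_base(index, base)];
}
```

With zero-based indexing the subtraction disappears, since the index already is the offset; this is the simplification that zero-based languages gain.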
Arrays can have multiple dimensions, so it is not uncommon to access an array using multiple indices.
For example, a two-dimensional array A with three rows and four columns might provide access to the
element at the 2nd row and 4th column by the expression A[1][3] in the case of a zero-based indexing
system. Thus two indices are used for a two-dimensional array, three for a three-dimensional array, and
n for an n-dimensional array.
The number of indices needed to specify an element is called the dimension, dimensionality, or rank of
the array.
In standard arrays, each index is restricted to a certain range of consecutive integers (or consecutive
values of some enumerated type), and the address of an element is computed by a "linear" formula on
the indices.
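For a two-dimensional array, one such linear formula is offset = row × (number of columns) + column. The C sketch below assumes row-major order (the layout C uses for its own arrays) and zero-based indices; row_major_offset and matrix_get are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Linear offset of element (row, col) in a row-major 2D array. */
size_t row_major_offset(size_t row, size_t col, size_t num_cols) {
    assert(col < num_cols);
    return row * num_cols + col;
}

/* Read element (row, col) from a 2D array stored in a flat 1D buffer. */
int matrix_get(const int *flat, size_t num_cols, size_t row, size_t col) {
    return flat[row_major_offset(row, col, num_cols)];
}
```

For an array with three rows and four columns, the element at row 1, column 3 is at offset 1 × 4 + 3 = 7 from the start.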
There are three main types of arrays:
1. One-dimensional array
2. Two-dimensional array
3. Three-dimensional array
1. One-dimensional (1D) array:
You can imagine a 1D array as a row, where elements are stored one after another.
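Syntax of a 1D array:
data_type array_name[array_size];
where,
data_type: is the type of data each element of the array stores.
array_name: is the name of the array using which we can refer to it.
array_size: is the number of elements the array can hold.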
For Example
int nums[5];
2. Two-dimensional (2D) array:
Multidimensional arrays can be considered as an array of arrays or as a matrix consisting of rows and
columns.
Syntax of a 2D array:
data_type array_name[sizeof_1st_dimension][sizeof_2nd_dimension];
where,
data_type: is the type of data each element of the array stores.
array_name: is the name of the array using which we can refer to it.
sizeof_dimension: is the number of elements the array has in the corresponding
dimension.
For Example
int nums[5][10];
3. Three-dimensional array:
A 3-D Multidimensional array contains three dimensions, so it can be considered an array of two-
dimensional arrays.
Syntax of a 3D array:
data_type array_name[sizeof_1st_dimension][sizeof_2nd_dimension][sizeof_3rd_dimension];
where,
data_type: is the type of data each element of the array stores.
array_name: is the name of the array using which we can refer to it.
sizeof_dimension: is the number of elements the array has in the corresponding
dimension.
For Example
int nums[5][10][2];
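The declarations above can be tried out with a short C sketch. The function names fill_and_sum_2d and count_3d_elements are made up; the array sizes are the ones used in the examples.

```c
#include <assert.h>
#include <stddef.h>

/* Fill the 5 x 10 array with the values 0..49 in row order and return their sum. */
int fill_and_sum_2d(void) {
    int nums[5][10];
    int total = 0;
    /* A 2D array is an array of rows: here 5 rows of 10 ints each. */
    assert(sizeof nums / sizeof nums[0] == 5);
    for (int i = 0; i < 5; i++) {
        for (int j = 0; j < 10; j++) {
            nums[i][j] = i * 10 + j;
            total += nums[i][j];
        }
    }
    return total;
}

/* Number of int elements in the 3D array: 5 * 10 * 2. */
size_t count_3d_elements(void) {
    int nums[5][10][2];
    return sizeof nums / sizeof nums[0][0][0];
}
```

Each extra dimension adds one more nested loop when visiting every element, and the total number of elements is the product of the dimension sizes.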
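Difference Table:
Basis | One-dimensional array | Two-dimensional array
Definition | Stores a single list of elements of a similar data type. | Stores a 'list of lists' of elements of a similar data type.
Example declaration | int arr[5]; // an array with one row of five elements | int arr[2][5]; // an array with two rows and five columns
Example contents | {a, b, c, d, e} | {a, b, c, d, e}, {f, g, h, i, j}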
STRINGS
In most programming languages, strings are treated as a distinct data type. This
means that strings have their own set of operations and properties. They can be
declared and manipulated using specific string-related functions and methods.
String Operations