
Input and output

Chapter 2 covers input and output streams in Java, detailing the distinction between InputStream and OutputStream for reading and writing byte sequences, as well as the use of Reader and Writer classes for Unicode. It discusses various types of streams, including DataInputStream and DataOutputStream for binary data, and the importance of character encoding when handling text. Additionally, the chapter explains object serialization for saving and loading objects, along with file manipulation methods for creating, copying, and deleting files.

Uploaded by

nghia250803
Chapter 2: Input and Output

2.1 Input/Output Streams


-In the Java API, an object from which we can read a
sequence of bytes is an input stream. An object to which
we can write a sequence of bytes is an output stream.
-The sources and destinations of these byte sequences
can be files, network connections, or blocks of memory.
-The abstract classes InputStream and OutputStream are
the basis for a hierarchy of I/O classes.
-Note: These I/O streams are unrelated to the streams in
Chap 1.
-Byte-oriented I/O streams are inconvenient for
processing information stored in Unicode (which uses
multiple bytes per character). A separate hierarchy
provides classes inheriting from the Reader and Writer
classes for processing Unicode. Their operations are
based on 2-byte char values (UTF-16 code units).
2.1.1 Reading and Writing Bytes
-InputStream has 1 abstract method: abstract int read()
+This method reads one byte and returns it, or returns -1
when it encounters the end of the input source.
+System.in is a predefined object of a subclass of
InputStream that reads information from standard input
(the console or a redirected file).
+This class also has nonabstract methods to read an
array of bytes or to skip bytes. To read all bytes of a
stream, call readAllBytes().
-OutputStream defines 1 abstract method:
abstract void write(int b), which writes 1 byte to an
output location.
+Write an array of bytes: write(byte[] b)
+transferTo(): transfers all bytes from an input stream
to an output stream.
-Both read() and write() block until the byte is actually
read or written. While the current thread is blocked,
other threads get the chance to do useful work.
+available(): checks the number of bytes that are
currently available for reading without blocking.
-When you finish reading from or writing to an I/O
stream, close it by calling close(). If you don’t close a
file, the last packet of bytes might never be delivered.
You can also flush the output manually with flush().
-However, the data that programmers are interested in are
usually not raw bytes but numbers, strings, and objects.
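-A minimal sketch of the byte-level calls above. It uses in-memory streams instead of files so that it is self-contained; the class and method names (ByteStreamDemo, countBytes, copyAll) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ByteStreamDemo {
    // Counts the bytes of a stream by calling read() until it returns -1.
    static int countBytes(InputStream in) {
        try {
            int count = 0;
            while (in.read() != -1) count++;
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies everything from in into a byte array with transferTo() (Java 9+).
    static byte[] copyAll(InputStream in) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            in.transferTo(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30};
        InputStream in = new ByteArrayInputStream(data);
        System.out.println(in.available());   // bytes readable without blocking
        System.out.println(countBytes(in));
        System.out.println(copyAll(new ByteArrayInputStream(data)).length);
    }
}
```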
2.1.2 The Complete Stream Zoo
-Java has a whole zoo of more than 60 different I/O
stream types.
-Two base hierarchies process bytes and characters:
InputStream/OutputStream for bytes, Reader/Writer for
characters.
-For Unicode text: Reader and Writer
+The basic methods: int read() and void write(int c)
+read() returns a UTF-16 code unit (0-65535) or -1 at the
end of input. write() is called with a Unicode code unit.
-There are 4 additional interfaces: Closeable, Flushable,
Readable, Appendable
+Closeable and Flushable have one method each:
void close() throws IOException and void flush().
+InputStream, OutputStream, Reader and Writer all
implement the Closeable interface.
+Note: The Closeable interface extends the AutoCloseable
interface, so you can use try-with-resources with any
Closeable.
+OutputStream and Writer implement Flushable.
+Readable has a single method: int read(CharBuffer cb)
+CharBuffer has methods for sequential and random
read/write access. It represents an in-memory buffer or a
memory-mapped file (2.5.2)
+Appendable has 2 methods for appending single
characters and character sequences:
Appendable append(char c) and
Appendable append(CharSequence s)
+CharSequence describes the basic properties of a
sequence of char values. It’s implemented by String,
CharBuffer, StringBuilder, StringBuffer.
+Of the four base classes, only Writer implements
Appendable.
-API: p62
2.1.3 Combining Input/Output Stream Filters
-FileInputStream and FileOutputStream give you input and
output streams attached to a disk file. You pass the
file name or full path name to the constructor.
+Tip: All classes in java.io interpret relative path names
as starting from the user’s working directory. You can
get this directory by calling
System.getProperty(“user.dir”)
+Caution: The backslash is the escape character in Java
strings, so use \\ in Windows path names.
+Like InputStream and OutputStream, these classes only
support reading and writing at the byte level.
-DataInputStream can read numeric types (readInt(),
readDouble(), ...) but cannot fetch data from a file by
itself.
-Java uses a clever mechanism to combine the two
responsibilities.
+Example: to read numbers from a file, first create a
FileInputStream and then pass it to the constructor of a
DataInputStream.
-You can add multiple capabilities by nesting the filters.
+By default, input streams are not buffered. Every read()
asks the OS to dole out yet another byte. If you want
buffering and the data input methods for a file, wrap the
FileInputStream in a BufferedInputStream and wrap that in
a DataInputStream.
+We put the DataInputStream last in the chain of
constructors because we want to use the DataInputStream
methods.
+Sometimes you need to keep track of the intermediate
input streams. For example, PushbackInputStream lets you
peek at the next byte: read it and, if it is not what you
wanted, push it back with unread().
+Reading and unreading are the only methods that apply to
a pushback input stream.
+If you want to look ahead and also read numbers, use
both a pushback input stream and a data input stream.
-The ability to mix and match filter classes to construct
useful sequences of input/output streams gives you an
immense amount of flexibility.
-API: p67
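-A sketch of nesting filters as described above. To stay self-contained it reads from a ByteArrayInputStream where the notes use a FileInputStream; the class name FilterChainDemo and the method roundTrip are illustrative assumptions.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

class FilterChainDemo {
    // Writes an int and a double in binary, then reads them back through the chain
    // DataInputStream -> PushbackInputStream -> BufferedInputStream -> byte source.
    static double roundTrip(int n, double x) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(n);
                out.writeDouble(x);
            }
            // Keep a reference to the intermediate pushback stream for look-ahead.
            PushbackInputStream pbin = new PushbackInputStream(
                new BufferedInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            try (DataInputStream din = new DataInputStream(pbin)) {
                int next = pbin.read();   // peek at the next byte ...
                pbin.unread(next);        // ... and push it back unchanged
                return din.readInt() + din.readDouble();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3, 0.5));
    }
}
```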
2.1.4 Text Input and Output
-When saving data, you have the choice between binary and
text formats. Although binary I/O is fast and efficient,
it is not easily readable by humans.
-When saving text strings, consider the character
encoding:
+Java uses the UTF-16 encoding internally. Example: the
string “José” is encoded as “00 4A 00 6F 00 73 00 E9” in
hex.
+Many programs expect that text files use a different
encoding. UTF-8 is the encoding most used on the
Internet.
-The OutputStreamWriter class turns an output stream of
Unicode code units into a stream of bytes, using a chosen
character encoding. InputStreamReader turns an input
stream that contains bytes into a reader that emits
Unicode code units.
+new InputStreamReader(System.in) makes an input reader
that reads keystrokes from the console and converts them
to Unicode.
+This reader assumes the default character encoding
used by the host system (such as Windows 1252 or
MacRoman). You should always choose a specific encoding
by passing a Charset such as StandardCharsets.UTF_8 to
the InputStreamReader constructor.
-The Reader and Writer classes only read and write
individual characters. You use subclasses for strings
and numbers.
2.1.5 How to Write Text Output
-For text output, use PrintWriter to print strings and
numbers in text format. To print to a file, construct a
PrintWriter from a file name and a character encoding.
+Write to a print writer with print(), println() and
printf(), which print numbers, characters, boolean
values, strings and objects.
+This writes the characters to the print writer. They are
then converted to bytes and end up in the file.
-If autoflush is enabled, all characters in the buffer
are sent to their destination whenever println() is
called. By default, autoflush is not enabled. You can
enable or disable it with the constructor
PrintWriter(Writer writer, boolean autoFlush).
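-A small sketch of the autoflush flag. It prints into an in-memory byte sink (an assumption made so the effect can be observed without a file) and reports how many bytes have reached the sink right after println(). AutoFlushDemo and bytesDeliveredAfterPrintln are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class AutoFlushDemo {
    // Returns the number of bytes that arrived at the destination
    // immediately after println(), before the writer is closed.
    static int bytesDeliveredAfterPrintln(boolean autoFlush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(
            new OutputStreamWriter(sink, StandardCharsets.UTF_8), autoFlush);
        out.print("Harry Hacker ");
        out.println(75000.0);
        int delivered = sink.size();
        out.close();
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println("autoflush on:  " + bytesDeliveredAfterPrintln(true));
        System.out.println("autoflush off: " + bytesDeliveredAfterPrintln(false));
    }
}
```

With autoflush on, the text has already been converted to bytes and delivered when println() returns; with autoflush off it may still sit in a buffer until flush() or close().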

2.1.6 How to Read Text Input


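-The easiest way to process arbitrary text is the Scanner
class.
-You can read a short text file into a string with
Files.readString(path, charset).
-To get the file as a sequence of lines, call
Files.readAllLines(path, charset).
-If the file is large, process the lines lazily as a
Stream<String> with Files.lines(path, charset).
-You can use a scanner to read tokens (strings that are
separated by a delimiter). The default delimiter is
white space.
+in.useDelimiter("\\PL+") accepts any non-letter Unicode
characters as delimiters.
+next() yields the next token.
+Alternatively, you can obtain a stream of all tokens by
calling tokens().
-In early versions of Java, the only way to process text
input was the BufferedReader class and its readLine()
method.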
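-A sketch of token and line reading. The input comes from a string rather than a file so the example runs anywhere; TextInputDemo, words and lines are illustrative names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

class TextInputDemo {
    // Splits the text into words, treating any run of non-letters as a delimiter.
    static List<String> words(String text) {
        try (Scanner in = new Scanner(text)) {
            in.useDelimiter("\\PL+");
            return in.tokens().collect(Collectors.toList());   // tokens() needs Java 9+
        }
    }

    // The older approach: read line by line until readLine() returns null.
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) result.add(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world! 42 times"));
        System.out.println(lines("first\nsecond"));
    }
}
```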

2.1.7 Saving Objects in Text Format


-Example: store an array of Employee records in a text
file. Each record is stored in a separate line, and the
instance fields are separated from each other by
delimiters.
+Write records:

+Read records:
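-One possible shape for the write and read steps, assuming | as the delimiter and an Employee with name, salary and hire date. The Employee class here is a stand-in written for this sketch, and the records go through a StringWriter instead of a file.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.time.LocalDate;
import java.util.Scanner;

class EmployeeTextDemo {
    static final class Employee {
        final String name;
        final double salary;
        final LocalDate hireDay;

        Employee(String name, double salary, LocalDate hireDay) {
            this.name = name;
            this.salary = salary;
            this.hireDay = hireDay;
        }
    }

    // Writes one record per line, fields separated by '|'.
    static void writeEmployee(PrintWriter out, Employee e) {
        out.println(e.name + "|" + e.salary + "|" + e.hireDay);
    }

    // Reads one line and splits it; '|' must be escaped because split() takes a regex.
    static Employee readEmployee(Scanner in) {
        String[] tokens = in.nextLine().split("\\|");
        return new Employee(tokens[0], Double.parseDouble(tokens[1]), LocalDate.parse(tokens[2]));
    }

    static Employee roundTrip(Employee e) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writeEmployee(out, e);
        out.flush();
        return readEmployee(new Scanner(buffer.toString()));
    }

    public static void main(String[] args) {
        Employee e = roundTrip(new Employee("Carl Cracker", 75000, LocalDate.of(1987, 12, 15)));
        System.out.println(e.name + " " + e.salary + " " + e.hireDay);
    }
}
```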
2.1.8 Character Encodings
-Input and output streams are for sequences of bytes,
but texts are sequences of characters. It matters how
characters are encoded into bytes.
-Java uses the Unicode standard for characters. Each
character (or code point) has a 21-bit integer number.
There are different character encodings (methods for
packaging those 21-bit numbers into bytes).
+The most common encoding is UTF-8, which encodes
each Unicode code point into a sequence of one to four
bytes.
+Another common encoding is UTF-16, which encodes
each Unicode code point into one or two 16-bit values.
This is the encoding used in Java strings.
-There are also partial encodings that cover a character
range suitable for a given user population, such as
ISO 8859-1 and Shift-JIS.
-There is no reliable way to automatically detect the
character encoding from a stream of bytes. Some API
methods let you use the “default charset” (the character
encoding preferred by the OS of the computer). You should
always specify the encoding explicitly. For example,
when reading a web page, check the Content-Type header.
-Note: Charset.defaultCharset() returns the platform
encoding. Charset.availableCharsets() returns all
available Charset instances, as a map from canonical
names to Charset objects.
-The StandardCharsets class has static variables of type
Charset for the character encodings that every JVM must
support: UTF_8, UTF_16, UTF_16BE, UTF_16LE, ISO_8859_1,
US_ASCII.
+To obtain the Charset for another encoding, use the
static forName() method.
+Use the Charset object when reading or writing text.
Example: you can turn an array of bytes into a string
with new String(bytes, charset).
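-A short sketch comparing encodings of the same string. CharsetDemo and its two helper methods are names invented for this example; the é is written as the escape \u00e9 so the source file encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Number of bytes the string occupies in the given encoding.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Turns raw bytes back into a string using an explicit encoding.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String jose = "Jos\u00e9";
        System.out.println(encodedLength(jose, StandardCharsets.UTF_8));
        System.out.println(encodedLength(jose, StandardCharsets.UTF_16BE));
        System.out.println(encodedLength(jose, Charset.forName("ISO-8859-1")));
        byte[] utf8 = jose.getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(jose));
    }
}
```

Decoding with a different charset than the one used for encoding generally yields wrong characters, which is why the encoding should be stated explicitly on both sides.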

2.2 Reading and Writing Binary Data


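-Text format is convenient for testing and debugging, but
it is not as efficient as transmitting data in binary
format.
2.2.1 The DataInput and DataOutput interfaces
-The DataOutput interface defines methods for writing a
number, character, boolean value or string in binary
format: writeInt, writeShort, writeLong, writeFloat,
writeDouble, writeChar, writeBoolean, writeByte,
writeChars, writeUTF.
-Read the data back with the methods of the DataInput
interface: readInt, readShort, readLong, readFloat,
readDouble, readChar, readBoolean, readUTF.
-To read binary data from a file, combine a
DataInputStream with a source of bytes such as a
FileInputStream.
-To write binary data, use a DataOutputStream, which
implements the DataOutput interface.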

2.2.2 Random-Access Files


-The RandomAccessFile class lets you read or write data
anywhere in a file. Disk files are random-access, but I/O
streams that communicate with a network socket are not.
+You can open a random-access file either for reading
only or for both reading and writing, by passing "r" or
"rw" as the second constructor argument.
+When you open an existing file as a RandomAccessFile, it
does not get deleted.
+It has a file pointer that indicates the position of the
next byte to be read or written. seek() sets the file
pointer to an arbitrary byte position. getFilePointer()
returns the current position of the file pointer.
-The RandomAccessFile class implements both DataInput and
DataOutput. Use methods such as readInt/writeInt and
readChar/writeChar.
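-A sketch of random access with fixed-size records: a few ints are written to a temporary file, then seek() jumps straight to the n-th one. RandomAccessDemo and readNthInt are hypothetical names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RandomAccessDemo {
    // Writes the values as 4-byte ints, then reads back only the n-th one.
    static int readNthInt(int[] values, int n) {
        try {
            Path file = Files.createTempFile("ints", ".dat");
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
                for (int v : values) raf.writeInt(v);
                raf.seek((long) n * Integer.BYTES);   // each record is 4 bytes long
                return raf.readInt();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readNthInt(new int[] {10, 20, 30}, 1));
    }
}
```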
2.4.1 Paths
-A Path is a sequence of directory names, optionally
followed by a file name.
+The first component may be a root component such as /
or C:\. A path that starts with a root component is
absolute; otherwise, it is relative.
+Paths.get() receives one or more strings, which it joins
with the path separator of the default file system (/ for
a UNIX-like file system, \ for Windows). It returns a Path
object, or throws an InvalidPathException if the result is
not a valid path in the given file system.
+get() can also take a single string containing multiple
components. Example: read a path from a configuration
file and pass the whole string to Paths.get().
-It’s very common to combine or resolve paths:
+p.resolve(q): if q is absolute, the result is q.
Otherwise, the result is “p then q”.
+A shortcut for resolve() takes a string instead of a
path.
+resolveSibling() resolves against a path’s parent,
yielding a sibling path.
-The opposite of resolve() is relativize(): p.relativize(r)
yields the path q which, when resolved with p, yields r.
-normalize() removes any redundant . and .. components.
-toAbsolutePath() yields the absolute path of a given
path, starting at a root component.
-More useful methods of the Path interface: getParent(),
getFileName(), getRoot().
-You can construct a Scanner from a Path object.
-Note: Path has toFile() and the File class has toPath()


-API: p111
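-A sketch of the path operations above using relative paths only, so nothing on disk is touched. The directory names (home, harry, fred) are sample data, and PathDemo with its helpers is a made-up class.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathDemo {
    // p.relativize(r): the path that leads from base to target.
    static Path relativeTo(Path base, Path target) {
        return base.relativize(target);
    }

    // Removes redundant "." and ".." components.
    static Path cleaned(String raw) {
        return Paths.get(raw).normalize();
    }

    public static void main(String[] args) {
        Path base = Paths.get("home", "harry");
        Path work = base.resolve("work");              // home/harry/work
        Path temp = work.resolveSibling("temp");       // home/harry/temp
        System.out.println(work);
        System.out.println(temp);
        System.out.println(relativeTo(base, Paths.get("home", "fred", "input.txt")));
        System.out.println(cleaned("home/harry/../fred/./input.txt"));
        System.out.println(temp.getParent() + " | " + temp.getFileName());
    }
}
```

The printed separators depend on the platform, which is why the sketch compares and builds Path objects instead of strings.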
2.2.3 ZIP Archives
-ZIP archives store one or more files in a (usually)
compressed format. Each ZIP archive has a header with
information such as the name of each file and the
compression method that was used.
-You use a ZipInputStream to read a ZIP archive.
getNextEntry() returns a ZipEntry object that describes
the entry, or null when there are no more entries. The
typical sequence is: call getNextEntry(), read the
entry’s data from the stream, call closeEntry(), and
repeat.
-To write a ZIP file, use a ZipOutputStream. For each
entry, create a ZipEntry object, pass it to
putNextEntry(), write the data, and call closeEntry().
-Note: JAR files are simply ZIP files with a special entry,
the manifest. Use JarInputStream and JarOutputStream to
read and write the manifest entry.
-API: p86
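-A sketch of both code sequences, writing the archive into a byte array instead of a file so the example is self-contained. ZipDemo, zip and entryNames are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipDemo {
    // Creates an archive with one entry per map element (entry name -> text content).
    static byte[] zip(Map<String, String> files) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
                for (Map.Entry<String, String> file : files.entrySet()) {
                    zout.putNextEntry(new ZipEntry(file.getKey()));
                    zout.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                    zout.closeEntry();
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Walks through the archive and collects the entry names.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());   // the entry data could be read from zin here
                zin.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(entryNames(zip(Map.of("hello.txt", "Hello"))));
    }
}
```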
2.3 Object Input/Output Streams and
Serialization
-Using a fixed-length record format is a good choice to
store data of the same type. However, it is not suitable
for storing polymorphic collections.
-Java supports a very general mechanism, object
serialization, which makes it possible to write any
object to an output stream and read it again later.
2.3.1 Saving and Loading Serializable Objects
-To save object data, first open an ObjectOutputStream.
+Then save each object with writeObject().
-To read the objects back in, get an ObjectInputStream.
+Retrieve the objects with readObject(), in the same
order in which they were written.
-The class must implement the Serializable interface,
which has no methods.
-Note: writeObject/readObject save and restore only
objects. For primitive type values, use writeInt/readInt
or writeDouble/readDouble. The object I/O stream classes
implement the DataInput/DataOutput interfaces.
-ObjectOutputStream looks at all the fields of the objects
and saves their contents.
+Example: when writing an Employee object, the name, date
and salary fields are written to the output stream.
+What happens when one object is shared by several
objects as part of their state? Each object is saved with
a serial number, hence the name object serialization.
-Writing an object:
+Associate a serial number with each object reference
that is encountered.
+When encountering an object reference for the first
time, save the object data to the output stream.
+If it has been saved previously, just write “same as the
previously saved object with serial number x”.
-Reading an object:
+When an object is specified in an object input stream
for the first time, construct it, initialize it with the
stream data, and remember the association between the
serial number and the object reference.
+When the tag “same as the previously saved object with
serial number x” is encountered, retrieve the object
reference for that serial number.
-Note: In this chapter, we use serialization to save a
collection of objects to a disk file and retrieve it.
Another very important application is the transmittal of
a collection of objects across a network connection.
-API: p93
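-A sketch showing that a shared object stays shared after a round trip. The Employee and Manager classes are simplified stand-ins defined for this example, and the objects are written to a byte array instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Employee(String name) { this.name = name; }
    }

    static class Manager extends Employee {
        private static final long serialVersionUID = 1L;
        Employee secretary;
        Manager(String name, Employee secretary) {
            super(name);
            this.secretary = secretary;
        }
    }

    // Two managers share one secretary; checks that the copies still share one object.
    static boolean sharedAfterRoundTrip() {
        Employee harry = new Employee("Harry");
        Manager[] staff = { new Manager("Carl", harry), new Manager("Tony", harry) };
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(staff);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Manager[] copy = (Manager[]) in.readObject();
                // Same secretary object for both copies, but not the original instance.
                return copy[0].secretary == copy[1].secretary && copy[0].secretary != harry;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sharedAfterRoundTrip());
    }
}
```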
2.3.2 Understanding the Object Serialization
File Format
-Object serialization saves object data in a particular
file format. Studying the data format is extremely helpful
for gaining insight into the object serialization
process.
<Continue Later>
-What you should remember is:
+The serialized format contains the types and data fields
of all objects.
+Each object is assigned a serial number.
+Repeated occurrences of the same object are stored as
references to that serial number.
2.3.3 Modifying the Default Serialization
Mechanism
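-Java has an easy mechanism to prevent fields from
being serialized: mark them with the keyword transient.
-The serialization mechanism provides a way for
individual classes to add validation or any other desired
action to the default read and write behavior. A
serializable class can define methods with the
signatures:
private void readObject(ObjectInputStream in) throws
IOException, ClassNotFoundException
private void writeObject(ObjectOutputStream out) throws
IOException
+Then the data fields are no longer automatically
serialized; these methods are called instead.
-Instead of letting the serialization mechanism save and
restore object data, a class can define its own mechanism
by implementing the Externalizable interface, which
requires two methods:
public void readExternal(ObjectInput in) throws
IOException, ClassNotFoundException
public void writeExternal(ObjectOutput out) throws
IOException
+These methods are fully responsible for saving and
restoring the entire object, including the superclass
data.
+Example: implement these methods for the Employee class.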
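-A sketch of the transient keyword combined with private writeObject/readObject methods. The Circle class is a hypothetical example: its area field is derived from the radius, so it is skipped during serialization and recomputed on reading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Circle implements Serializable {
        private static final long serialVersionUID = 1L;
        double radius;
        transient double area;   // derived value, not written to the stream

        Circle(double radius) {
            this.radius = radius;
            this.area = Math.PI * radius * radius;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();            // writes the non-transient fields
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();              // restores radius; area is still 0 here
            area = Math.PI * radius * radius;    // recompute the transient field
        }
    }

    static Circle roundTrip(Circle original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Circle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Circle(2)).area);
    }
}
```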
2.3.4 Serializing Singletons and Typesafe
Enumerations
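-If you use the enum construct, you need not worry about
serialization.
-Suppose you maintain legacy code that contains a
typesafe enumeration written as a class with a private
constructor and a fixed set of static constants.
+After such a value is serialized and read back, an ==
test against the predefined constants will fail. The
saved value is a completely new object that is not equal
to any of the predefined constants.
+To solve this problem, define another special
serialization method called readResolve().
+Remember to add readResolve() to all typesafe
enumerations in legacy code and to all classes that follow
the singleton design pattern.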
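-A sketch of readResolve() for a legacy-style typesafe enumeration. The Orientation class with its two constants is an assumed example, not code from these notes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ReadResolveDemo {
    static final class Orientation implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Orientation HORIZONTAL = new Orientation(1);
        static final Orientation VERTICAL = new Orientation(2);

        private final int value;

        private Orientation(int value) { this.value = value; }

        // Called after deserialization; the returned object replaces the new instance.
        private Object readResolve() {
            return value == 1 ? HORIZONTAL : VERTICAL;
        }
    }

    // Serializes a constant and checks that reading yields the very same instance.
    static boolean sameInstanceAfterRoundTrip(Orientation original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject() == original;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterRoundTrip(Orientation.HORIZONTAL));
    }
}
```

Without the readResolve() method, the same check would report false because the stream creates a fresh Orientation object.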
2.3.5 Versioning

2.3.6 Using Serialization for Cloning


-An amusing use for the serialization mechanism: cloning
an object. Simply serialize it to an output stream and
then read it back in. You don’t have to write the object
to a file; you can use a ByteArrayOutputStream to save
the data into a byte array.
2.4 Working with Files
-The Path interface and the Files class encapsulate the
functionality required to work with the file system on
the user’s machine.
2.4.2 Reading and Writing Files
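-Read the entire content of a file:
Files.readAllBytes(path)
+Read the content of a text file as a string:
Files.readString(path, charset)
+Read the file as a sequence of lines:
Files.readAllLines(path, charset)
-Write a string: Files.writeString(path, content, charset)
+Append to a file: pass StandardOpenOption.APPEND as an
additional argument.
+Write a collection of lines:
Files.write(path, lines, charset)
-These methods are intended for text files of moderate
length. If the files are large or binary, use the
familiar I/O streams or readers/writers, obtained from
Files.newInputStream(), Files.newOutputStream(),
Files.newBufferedReader() and Files.newBufferedWriter().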
-API: p113
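-A sketch of the convenience methods above working on a temporary file. It assumes Java 11 or later for Files.writeString(); FilesReadWriteDemo and writeAppendRead are made-up names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class FilesReadWriteDemo {
    // Writes a string, appends another string and a list of lines, then reads all lines.
    static List<String> writeAppendRead() {
        try {
            Path file = Files.createTempFile("notes", ".txt");
            try {
                Files.writeString(file, "first\n", StandardCharsets.UTF_8);
                Files.writeString(file, "second\n", StandardCharsets.UTF_8,
                    StandardOpenOption.APPEND);
                Files.write(file, List.of("third", "fourth"), StandardCharsets.UTF_8,
                    StandardOpenOption.APPEND);
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAppendRead());
    }
}
```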
2.4.3 Creating Files and Directories
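-Create a new directory: Files.createDirectory(path). All
but the last component in the path must already exist.
+Create intermediate directories as well:
Files.createDirectories(path)
-Create an empty file: Files.createFile(path) throws an
exception if the file already exists.
-Create a temporary file or directory in a given or
system-specific location: Files.createTempFile(dir,
prefix, suffix), Files.createTempFile(prefix, suffix),
Files.createTempDirectory(dir, prefix),
Files.createTempDirectory(prefix)
+dir is a Path, and prefix/suffix are strings that may be
null. Example: Files.createTempFile(null, ".txt") might
return a path with a generated name in the system’s
temporary directory.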
-API: p114
2.4.4 Copying, Moving, and Deleting Files
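-Copy a file: Files.copy(fromPath, toPath)
+Move the file (copy and delete the original):
Files.move(fromPath, toPath)
+The copy or move will fail if the target exists. If you
want to overwrite an existing target, use the
REPLACE_EXISTING option. To copy all file attributes, use
COPY_ATTRIBUTES.
+Specify that a move should be atomic: ATOMIC_MOVE
+You can copy an input stream to a Path, which means
saving the input stream to disk: Files.copy(inputStream,
toPath). Or copy a Path to an output stream:
Files.copy(fromPath, outputStream)
-Delete a file: Files.delete(path)
+This method throws an exception if the file doesn’t
exist.
+Alternative: Files.deleteIfExists(path) returns false
instead.
+Both methods can also be used to remove an empty
directory.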
-API: p115
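-A sketch that creates, copies, moves and deletes files inside a fresh temporary directory. The file names a.txt, b.txt and c.txt are sample data, and CopyMoveDeleteDemo is a hypothetical class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CopyMoveDeleteDemo {
    // Returns true if every step behaved as the notes describe.
    static boolean run() {
        try {
            Path dir = Files.createTempDirectory("demo");
            Path source = Files.writeString(dir.resolve("a.txt"), "data");
            Path copy = Files.copy(source, dir.resolve("b.txt"));
            // The target exists now, so a second copy needs REPLACE_EXISTING.
            Files.copy(source, copy, StandardCopyOption.REPLACE_EXISTING);
            Path moved = Files.move(copy, dir.resolve("c.txt"));
            boolean movedOk = Files.exists(moved) && !Files.exists(copy);

            Files.delete(source);
            Files.delete(moved);
            boolean secondDelete = Files.deleteIfExists(source);   // file is already gone
            Files.delete(dir);                                     // works: directory is empty
            return movedOk && !secondDelete && !Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```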
2.4.5 Getting File Information
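-Static methods that return a boolean value check a
property of a path: exists, isHidden, isReadable,
isWritable, isExecutable, isRegularFile, isDirectory,
isSymbolicLink.
-size(): returns the number of bytes in a file.
-getOwner(): returns the owner of the file, as an
instance of java.nio.file.attribute.UserPrincipal.
-All file systems report a set of basic attributes,
encapsulated by the BasicFileAttributes interface. They
are:
+The times at which the file was created, last accessed,
and last modified
+Whether the file is a regular file, a directory, a
symbolic link, or none of these
+The file size
+The file key (an object that identifies the file)
+Get these attributes:
Files.readAttributes(path, BasicFileAttributes.class)
+If the user’s file system is POSIX-compliant, you can
instead get an instance of PosixFileAttributes:
Files.readAttributes(path, PosixFileAttributes.class)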
-API: p117
2.4.6 Visiting Directory Entries
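-Files.list() returns a Stream<Path> that reads the
entries of a directory lazily. Since reading a directory
involves a system resource that needs to be closed, use a
try-with-resources block.
+list() doesn’t enter subdirectories. To process all
descendants of a directory, use Files.walk().
+Limit the depth of the tree that you want to visit by
calling Files.walk(path, depth).
-Note: To filter the paths by file attributes, use
Files.find() instead of walk().
-Example: copy one directory to another.
-You can’t easily use Files.walk() to delete a tree of
directories since you need to delete the children before
deleting the parent.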
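-A sketch of copying one directory tree to another with Files.walk(). The helper copyTree and the sample layout (sub/f.txt) are assumptions of this example; the temporary directories it creates are left behind for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class WalkCopyDemo {
    // Visits source depth-first (parents before children) and mirrors it under target.
    static void copyTree(Path source, Path target) {
        try (Stream<Path> entries = Files.walk(source)) {
            entries.forEach(p -> {
                try {
                    Path q = target.resolve(source.relativize(p));
                    if (Files.isDirectory(p)) Files.createDirectories(q);
                    else Files.copy(p, q);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small tree, copies it, and checks the copied file's content.
    static boolean demo() {
        try {
            Path source = Files.createTempDirectory("src");
            Files.createDirectories(source.resolve("sub"));
            Files.writeString(source.resolve("sub").resolve("f.txt"), "x");
            Path target = Files.createTempDirectory("dst");
            copyTree(source, target);
            return Files.readString(target.resolve("sub").resolve("f.txt")).equals("x");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```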
2.4.7 Using Directory Streams
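-Files.newDirectoryStream() yields a DirectoryStream. It’s
not a subinterface of java.util.stream.Stream but an
interface that is specialized for directory traversal.
+It is a subinterface of Iterable, so you can use it in
an enhanced for loop.
+Filter the files with a glob pattern:
Files.newDirectoryStream(dir, "*.java")
+Caution: If you use the glob syntax on Windows, you
have to escape backslashes twice: once for the glob
syntax and once for the Java string syntax.
-To visit all descendants of a directory, call
walkFileTree() and supply an object of type FileVisitor.
This object gets notified:
+When a file is encountered: visitFile()
+Before a directory is processed: preVisitDirectory()
+After a directory is processed: postVisitDirectory()
+When an error occurs while trying to visit a file or
directory: visitFileFailed()
+In each case, specify whether you want to continue
(FileVisitResult.CONTINUE), skip the entries of this
directory (SKIP_SUBTREE), skip the siblings of this file
(SKIP_SIBLINGS), or terminate the walk (TERMINATE).
+If any method throws an exception, the walk is
terminated.
+The class SimpleFileVisitor implements the FileVisitor
interface. All its methods except visitFileFailed() do
nothing and continue. visitFileFailed() rethrows the
exception that caused the failure.
-Example: print out all subdirectories.
+We need to override postVisitDirectory() and
visitFileFailed(); otherwise, the visit would fail as
soon as it encounters a directory that it’s not allowed
to open.
+The attributes of the path are passed as a parameter to
preVisitDirectory() and visitFile(). The visitor already
had to make an OS call to get them, since it needs to
distinguish between files and directories.
-The other methods of the FileVisitor interface are useful
if you need to do some work when entering or leaving a
directory. Example: delete a directory tree.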
-API: p123
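-A sketch of deleting a directory tree with walkFileTree(): files are deleted in visitFile(), and each directory is deleted in postVisitDirectory(), after its children are gone. DeleteTreeDemo and the sample tree are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class DeleteTreeDemo {
    static void deleteTree(Path root) {
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException e)
                        throws IOException {
                    if (e != null) throw e;   // iteration over the directory failed
                    Files.delete(dir);        // the directory is empty at this point
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds root/sub/f.txt, deletes the tree, and reports whether root is gone.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("tree");
            Files.createDirectories(root.resolve("sub"));
            Files.writeString(root.resolve("sub").resolve("f.txt"), "x");
            deleteTree(root);
            return !Files.exists(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```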
2.4.8 ZIP File Systems
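-The Paths class looks up paths in the default file
system (the files on the user’s local disk). You can have
other file systems.
-A useful one is a ZIP file system:
+Calling FileSystems.newFileSystem() with the path of a
ZIP file establishes a file system that contains all
files in the ZIP archive.
+Copy a file out of the archive:
Files.copy(fs.getPath(sourceName), targetPath)
+List all files in the ZIP archive: walk the file tree
starting at fs.getPath("/").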
-API: p124
2.5 Memory-Mapped Files
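-Most operating systems can take advantage of a virtual
memory implementation to map a file, or a region of a
file, into memory. Then the file can be accessed as if it
were an in-memory array.
2.5.1 Memory-Mapped File Performance
-Memory mapping pays off for huge files. To use it:
+Get a channel for the file: FileChannel.open(path,
options). A channel is an abstraction for a disk file
that lets you access OS features such as memory mapping,
file locking and fast data transfer between files.
+Get a ByteBuffer from the channel by calling map().
Specify the area of the file that you want to map and a
mapping mode: FileChannel.MapMode.READ_ONLY, READ_WRITE,
or PRIVATE.
+You can read and write data using the methods of the
ByteBuffer class and the Buffer superclass. Buffers
support both sequential access (get() while
hasRemaining()) and random access (get(i)).
+Read and write arrays of bytes: get(byte[] bytes),
get(byte[] bytes, int offset, int length)
+Read primitive-type values: getInt, getChar, getLong,
getShort, getFloat, getDouble
+Java uses big-endian ordering for binary data. To
process a file containing little-endian numbers, call
buffer.order(ByteOrder.LITTLE_ENDIAN)
+Find out the current byte order of a buffer:
buffer.order()
+Write numbers to a buffer: putInt, putChar, putLong,
putShort, putFloat, putDouble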
-API: p130
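-A sketch of mapping a file and reading it sequentially. It maps a tiny temporary file only to stay self-contained, although the technique is meant for large files; MemoryMapDemo and sumBytes are assumed names. The file is removed at JVM exit because some systems refuse to delete a file while it is mapped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MemoryMapDemo {
    // Maps the whole file read-only and adds up its bytes as unsigned values.
    static int sumBytes(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            int sum = 0;
            while (buffer.hasRemaining()) sum += buffer.get() & 0xFF;   // sequential access
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int demo() {
        try {
            Path file = Files.createTempFile("mapped", ".dat");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {1, 2, 3});
            return sumBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```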
2.5.2 The Buffer Data Structure
-A buffer is an array of values of the same type. The
Buffer class is an abstract class with concrete subclasses
ByteBuffer, CharBuffer, DoubleBuffer, FloatBuffer,
IntBuffer, LongBuffer, ShortBuffer.
+Note: The StringBuffer class isn’t related to these
buffers.
-A buffer has:
+A capacity that never changes
+A position at which the next value is read or written
+A limit beyond which reading and writing is meaningless
+Optionally, a mark for repeating a read or write
operation
+The values fulfill the condition:
0 ≤ mark ≤ position ≤ limit ≤ capacity
+The principal purpose of a buffer is a “write, then
read” cycle. At the outset, the position is 0 and the
limit is the capacity. put() adds values to the buffer.
When you run out of data or reach the capacity, it is
time to switch to reading.
+flip() sets the limit to the current position and the
position to 0. Keep calling get() while remaining()
(which returns limit - position) is positive. When you
have read all values in the buffer, call clear() to
prepare the buffer for the next writing cycle. clear()
resets the position to 0 and the limit to the capacity.
+To reread the buffer, use rewind() or mark()/reset().
+To get a buffer, call a static method such as
ByteBuffer.allocate() or ByteBuffer.wrap(). Then you can
fill a buffer from a channel with channel.read(buffer),
or write its contents to a channel with
channel.write(buffer).
-API: p134
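-A sketch of the “write, then read” cycle and of byte ordering, using heap buffers. BufferCycleDemo and its methods are names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

class BufferCycleDemo {
    // Fills a buffer, flips it, and reads everything back.
    static int writeThenRead(int... values) {
        IntBuffer buffer = IntBuffer.allocate(values.length);
        for (int v : values) buffer.put(v);    // writing phase: position advances
        buffer.flip();                         // limit = position, position = 0
        int sum = 0;
        while (buffer.remaining() > 0) sum += buffer.get();
        buffer.clear();                        // position = 0, limit = capacity
        return sum;
    }

    // Stores the int 1 with the given byte order and returns the first byte.
    static byte firstByteOfOne(ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES).order(order);
        buffer.putInt(1);
        return buffer.get(0);                  // absolute (random) access
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(1, 2, 3));
        System.out.println(firstByteOfOne(ByteOrder.BIG_ENDIAN));
        System.out.println(firstByteOfOne(ByteOrder.LITTLE_ENDIAN));
    }
}
```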
2.6 File Locking
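-A file lock controls access to a file or a range of bytes
within a file. It is useful when multiple simultaneously
executing programs need to modify the same file.
-Lock a file: call lock() or tryLock() of the FileChannel
class. lock() blocks until the lock becomes available;
tryLock() returns immediately with the lock, or with null
if the lock is not available.
+The file remains locked until the channel is closed or
release() is invoked on the lock.
-Lock a portion of the file:
lock(long start, long size, boolean shared) or
tryLock(long start, long size, boolean shared)
+The shared flag is false to lock the file for both
reading and writing. It’s true for a shared lock, which
allows multiple processes to read from the file while
preventing any process from acquiring an exclusive lock.
Call isShared() to find out which kind you have.
-Be sure to unlock the lock when you are done. This is
best done with a try-with-resources statement.
-File locking is system-dependent. Some points to watch:
+On some systems, file locking is merely advisory.
+On some systems, you cannot simultaneously lock a file
and map it into memory.
+File locks are held by the entire Java virtual machine,
so two programs launched by the same virtual machine
can’t each acquire a lock on the same file.
+On some systems, closing a channel releases all locks
on the underlying file held by the Java virtual machine.
+Locking files on a networked file system is highly
system-dependent and should probably be avoided.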
-API: p136
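-A sketch of acquiring and releasing an exclusive lock with try-with-resources. It locks a temporary file that no other program uses, so tryLock() is expected to succeed here; since locking is system-dependent, treat the result as an illustration only. FileLockDemo is a hypothetical class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileLockDemo {
    // Returns true if an exclusive lock on the whole file could be acquired.
    static boolean lockAndRelease() {
        try {
            Path file = Files.createTempFile("locked", ".dat");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileLock lock = channel.tryLock()) {
                // tryLock() yields null if another program holds the lock;
                // try-with-resources simply skips closing a null resource.
                return lock != null && lock.isValid() && !lock.isShared();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lockAndRelease());
    }
}
```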
2.7 Regular Expressions
-Regular expressions are used to specify string patterns.
You can use them to locate strings that match a
particular pattern.
-Check regex: https://regex101.com/
2.7.1 The Regular Expression Syntax
2.7.2 Matching a String
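-The simplest use for a regex is to test whether a
string matches it.
+Construct a Pattern object from a string containing the
regex with Pattern.compile(). Then get a Matcher object
from the pattern with matcher(input) and call matches().
+The input of the matcher is an object of any class that
implements the CharSequence interface (String,
StringBuilder, CharBuffer).
+You can set flags when compiling the pattern, for
example Pattern.compile(expression,
Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE), or
specify them inside the pattern: (?iu:expression)
+The flags: CASE_INSENSITIVE (i), UNICODE_CASE (u),
UNICODE_CHARACTER_CLASS (U), MULTILINE (m),
UNIX_LINES (d), DOTALL (s), COMMENTS (x), LITERAL,
CANON_EQ
-To match elements in a collection or stream, turn the
pattern into a predicate with asPredicate().
+The result contains all strings in which the regex is
found. Use asMatchPredicate() (Java 11) if the entire
string must match.
-If the regex contains groups, the Matcher object can
reveal the group boundaries: start(groupIndex) and
end(groupIndex).
-Extract the matched string: group(groupIndex)
-Group 0 is the entire match; the group index of the
first actual group is 1. Call groupCount() to get the
total group count.
-For named groups, written as (?<name>...), use
start(name), end(name) and group(name).
-Nested groups are ordered by their opening parentheses.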
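-A sketch of matching with flags, named groups and a predicate. The time-of-day pattern and the names RegexMatchDemo, hourOf and containing are assumptions made for this example.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexMatchDemo {
    // Matches times such as 11:59am; the flag makes "AM" acceptable as well.
    private static final Pattern TIME = Pattern.compile(
        "(?<hour>1?[0-9]):(?<minute>[0-5][0-9])(?<ampm>[ap]m)", Pattern.CASE_INSENSITIVE);

    // Returns the hour group if the whole input matches, otherwise null.
    static String hourOf(String input) {
        Matcher matcher = TIME.matcher(input);
        return matcher.matches() ? matcher.group("hour") : null;
    }

    // Keeps the strings in which the pattern is found somewhere.
    static List<String> containing(Pattern pattern, List<String> items) {
        return items.stream().filter(pattern.asPredicate()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Matcher matcher = TIME.matcher("11:59AM");
        if (matcher.matches()) {
            System.out.println(matcher.group(0) + " has " + matcher.groupCount() + " groups");
            System.out.println(matcher.group(2) + " = " + matcher.group("minute"));
        }
        System.out.println(containing(Pattern.compile("[0-9]+"), List.of("a1", "b", "22")));
    }
}
```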

2.7.3 Finding Multiple Matches


-Usually, you don’t want to match the entire input but
find one or more matching substrings:
+Call find() of the Matcher class to find the next match.
If it returns true, use start() and end() to find the
extent of the match, or group() without an argument to
get the matched string.
+In this way, you can process each match in turn.
+Or call results() to get a Stream<MatchResult>. Example:
get a list of all matches by mapping each result with
MatchResult::group.
-If you have the data in a file, use Scanner.findAll() to
get a Stream<MatchResult>.
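-A sketch of both ways to iterate over matches: the find() loop and the results() stream (Java 9+). FindMatchesDemo and its method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class FindMatchesDemo {
    // Collects the start index of every match with a find() loop.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) starts.add(matcher.start());
        return starts;
    }

    // Collects every matched substring via the results() stream.
    static List<String> allMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).results()
            .map(MatchResult::group)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(matchStarts("[0-9]+", "a1b22c333"));
        System.out.println(allMatches("[0-9]+", "a1b22c333"));
    }
}
```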

2.7.4 Splitting along Delimiters


-Pattern.split() breaks an input along matched
delimiters and returns an array of tokens.
+If there are many tokens, fetch them lazily with
splitAsStream().
+Use String.split() if you don’t care about precompiling
the pattern or lazy fetching.
-If the input is in a file, use a scanner: set the
pattern with useDelimiter() and call tokens().

2.7.5 Replacing Matches


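-replaceAll() of the Matcher class replaces all
occurrences of a regex with a replacement string.
+In the replacement string, $n is replaced with the nth
group and ${name} with the group of that name. Use \$ to
include a $ character.
+Matcher.quoteReplacement(str) escapes a string that may
contain \ and $ so that they are not interpreted.
-For complex operations, provide a replacement function
instead of a replacement string.
-replaceFirst() replaces only the first occurrence.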
-API: p148
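-A sketch of the three replacement styles: a plain replacement string, group references, and a replacement function (Java 9+). The patterns and the names in ReplaceDemo are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

class ReplaceDemo {
    // Replaces every run of digits with '#'.
    static String maskDigits(String input) {
        return Pattern.compile("[0-9]+").matcher(input).replaceAll("#");
    }

    // Uses named group references in the replacement string.
    static String spellOutTime(String input) {
        return Pattern.compile("(?<h>[0-9]+):(?<m>[0-9]+)").matcher(input)
            .replaceAll("${h} hours and ${m} minutes");
    }

    // Uses a function of the MatchResult to compute each replacement.
    static String upperCaseWords(String input) {
        return Pattern.compile("[a-z]+").matcher(input)
            .replaceAll(match -> match.group().toUpperCase());
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("a1b22"));
        System.out.println(spellOutTime("3:45"));
        System.out.println(upperCaseWords("ab 12 cd"));
        System.out.println(Pattern.compile("[0-9]+").matcher("a1b22").replaceFirst("#"));
    }
}
```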
