Application Layer (Chapter 7) - CSHub

The document summarizes key concepts related to application layer protocols: 1. DNS translates domain names to IP addresses in a hierarchical structure managed by registrars. URLs make it easier for users to access servers that may change IP addresses. 2. Email uses SMTP for sending messages between users and servers in an envelope-header-body format. POP3 and IMAP allow users to interact with their mailboxes on servers. 3. The World Wide Web is developed and standardized by W3C. HTTP uses URLs and persistent connections over TCP for efficient retrieval of web resources in different MIME types, like content delivery through CDNs.

Uploaded by

Speed Piano

Application Layer (chapter 7)

Domain Name System (DNS)


DNS is the service that translates domain names to IP addresses.

Domain names
Why domain names?
• Remembering IP addresses is annoying
• Servers may change IP address

What are domain names?


• Hierarchical data structure, where each node manages (owns) its children
• Top-level domains are either generic (e.g. .com) or country codes (e.g. .nl). Names under
them are sold by registrars, whom you pay a small annual fee in exchange for ownership of
the domain name

DNS name space


If you control a domain, you can specify arbitrary subdomains. The UK uses .ac.uk. for academic
use and .co.uk. for commercial use, whereas the Netherlands puts everything directly under .nl.
The trailing dot at the end of a name (as in .co.uk.) marks the root of the hierarchy, i.e. the end of the name.

Domain resource record


Domain name: The domain to which this record applies (Since a database will contain
information about multiple domain names)
Time to live: How long (in seconds) the record may be cached before it has to be looked up again.
Class: For the internet, always IN. May be set to other codes for other applications.
Type: The type of this record, see picture underneath for all types
Value: The meaning of this field is dependent on the type.
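
As a minimal sketch, the five fields above can be modelled as a small Python data class. The class name `ResourceRecord`, the field names and the example values (the documentation name example.com. and the documentation address 192.0.2.1) are assumptions chosen for illustration, not taken from the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    """One DNS resource record with the five fields described above."""
    name: str    # domain name this record applies to
    ttl: int     # time to live in seconds (how long the record may be cached)
    rclass: str  # "IN" for the internet
    rtype: str   # record type, e.g. "A" for an IPv4 address
    value: str   # meaning depends on rtype

    def to_zone_line(self) -> str:
        # Render the record in the usual "name ttl class type value" order.
        return f"{self.name} {self.ttl} {self.rclass} {self.rtype} {self.value}"


record = ResourceRecord("example.com.", 3600, "IN", "A", "192.0.2.1")
print(record.to_zone_line())
```

The value field is kept as a plain string here because its interpretation depends on the type: an address for an A record, a host name for an NS record, and so on.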
Name server
Location
The operating system keeps track of which name server to use. By default, it's the name server
of the autonomous system you are in. It is possible to change this if you prefer another
nameserver (e.g. 1.1.1.1). The nameserver which the operating system calls first is called the local
name server. This could actually be a local name server (e.g. your router), or a nameserver which
is a continent away.

DHCP assigns an IP address to a machine without one. Besides this, it also provides you the
address of a nameserver.

How to resolve a domain name?


DNS name servers are split up into multiple non-overlapping zones. A name server should be
able to tell you the IP addresses of all servers directly underneath it; these are authoritative
records. It may also cache records from further down; these are called cached records.

The IPs of name servers or records may also be cached at the local computer, by the operating
system. There are also 2 different ways of querying name servers:
• Recursive query: The local name server does further queries to other name servers
(better if you have weak machines)
• Iterative query: The machine does further queries to other name servers

One name resolution can use both mechanisms, as in the picture underneath, the local name
server is recursive while the other servers are iterative.
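
The combination described above can be sketched as a toy simulation: the client sends one recursive query to the local name server, which then performs iterative queries against the other servers and caches the answer. The server names, zone contents and the address 192.0.2.10 are hypothetical, and no real DNS traffic is involved.

```python
# Hypothetical toy name servers. Each maps a name suffix either to a
# referral ("NS", next server to ask) or to a final answer ("A", IP address).
SERVERS = {
    "root": {"nl.": ("NS", "nl-server")},
    "nl-server": {"example.nl.": ("NS", "example-server")},
    "example-server": {"www.example.nl.": ("A", "192.0.2.10")},
}


def query(server, name):
    """One iterative query: the server answers or refers us to another server."""
    for suffix, answer in SERVERS[server].items():
        # Compare on label boundaries so "xnl." does not match "nl.".
        if ("." + name).endswith("." + suffix):
            return answer
    raise LookupError(f"{server} knows nothing about {name}")


def resolve(name, cache):
    """Local name server: serves one recursive query by querying iteratively."""
    if name in cache:
        return cache[name], 0  # answered from the cache, no queries needed
    server, queries = "root", 0
    while True:
        kind, value = query(server, name)
        queries += 1
        if kind == "A":
            cache[name] = value
            return value, queries
        server = value  # follow the referral to the next name server


cache = {}
first = resolve("www.example.nl.", cache)
second = resolve("www.example.nl.", cache)
print(first, second)
```

The second call is answered from the cache, which is the effect the time to live field controls in a real resolver (this sketch never expires entries).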
Email
Structure
The participating computers are split up into 2 categories:
• Users (the computers sending and receiving emails, may not always be online)
• Message transfer agents, aka mail servers (route the emails to their destination, should
always be online)

Message format
Email messages contain an envelope, header, and body.

Envelope: Metadata intended for the mail server
Header: Metadata intended for the user
Body: The actual content of the email.
POP3 / IMAP
Users use POP3 (older) or IMAP to interact with their mailbox. These protocols retrieve emails
from the mail server. POP3 tends to remove the mails from the mail server, whereas IMAP leaves
them there. Retrieval has to be initiated by the destination (pull), whereas in many other
protocols the sender initiates the transfer (push).

IMAP
IMAP sends commands to a mail server to manipulate mailboxes. Some common commands:
1. LOGIN, log into server
2. FETCH, fetch messages from a folder
3. CREATE/DELETE, create or delete a folder
4. EXPUNGE, remove messages marked for deletion.

IMAP has largely replaced the POP3 protocol.

SMTP
Users and mailservers use SMTP to send email from a source to a destination. This is used
between user and mailservers (often with extensions, e.g. authentication) but also between
mailservers. Deciding upon where to send the message to is done using DNS.

Some issues:
• SMTP uses ASCII to send emails, so you can't send arbitrary binary data.
• Basic SMTP does not include authentication
• The FROM field in the header is not checked

MIME
To send more than just plain text, the Multipurpose Internet Mail Extensions (MIME)
exist. With MIME you can send other kinds of content, where the type is indicated by a
new header. Because we can't just force all mail servers to be updated, MIME messages must
still travel as plain ASCII over existing SMTP servers.
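
A minimal sketch of a MIME message, using Python's standard `email` package: a text body plus a binary attachment, each labelled with its own content type. The addresses, the file name and the placeholder bytes are made up for illustration, and nothing is actually sent.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"   # hypothetical sender
msg["To"] = "bob@example.com"       # hypothetical recipient
msg["Subject"] = "Holiday picture"
msg.set_content("See the attached image.")  # becomes a text/plain part

# Placeholder binary data (not a real image), only used to show the headers.
fake_png = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
msg.add_attachment(fake_png, maintype="image", subtype="png",
                   filename="photo.png")

attachment = next(msg.iter_attachments())
print(msg.get_content_type())                    # type of the whole message
print(attachment.get_content_type())             # type of the attachment
print(attachment["Content-Transfer-Encoding"])   # how the bytes are made ASCII-safe
```

Printing `msg.as_string()` shows the full message as it would go over SMTP: only ASCII characters, with the binary part encoded.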

Base64
To send binary data as ASCII, we use base64 encoding: we put in a binary string and we get ASCII out.
Base64 only uses 64 characters, so 6 bits are translated into 1 character (we just regroup the
bytes into 6-bit groups). Unfortunately, each character is sent as 8 bits, so this gives 2 bits of
overhead per 6-bit group: every 3 input bytes become 4 output characters (33% overhead).

To indicate padding, we use = signs. We use padding when the output length in base64 is not
divisible by 4.
We then pad until the output length is divisible by 4 (so the input is grouped by 3 bytes). A final ==
indicates that the last group contained only one byte, and a final = indicates that it
contained two bytes.
For example, the 5 bytes "any c" encode to YW55IGM, so we pad with one = to make the length
divisible by 4 (YW55IGM=).
The 4 bytes "any " (with a trailing space) encode to YW55IA, so we pad with == (YW55IA==).
Some other problems / properties:
• ASCII text only uses 7 bits per character, so each byte already carries an unused leading 0 bit.
Turning this into base64 gives an even bigger overhead
• If the number of input bits is not divisible by 6, the last group is filled up with zero bits
and padded, which gives an even bigger overhead
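
The padding rules can be checked with Python's standard `base64` module. This is a small sketch; the helper name `encode_with_padding` and the sample inputs are chosen for illustration.

```python
import base64


def encode_with_padding(data: bytes):
    """Return the base64 text for data and the number of '=' padding characters."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded, encoded.count("=")


for sample in (b"any", b"any ", b"any c"):
    encoded, padding = encode_with_padding(sample)
    # len(sample) % 3 tells how many bytes are left in the last input group.
    print(len(sample), "bytes ->", encoded, "with", padding, "padding characters")
```

Inputs whose length is a multiple of 3 need no padding, one leftover byte gives two = signs, and two leftover bytes give one = sign.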

World wide web


W3C
The World Wide Web Consortium (W3C) is an organization devoted to:
• Developing the web
• Standardizing protocols
• Improving compatibility between sites

HTTP
In order to query resources, we use HTTP. We send HTTP requests and get HTTP responses.
HTTP uses TCP, so before we can send an HTTP request, we need to set up a TCP connection.

We could create one connection for each HTTP request that we want to send. That is very
inefficient, so we use persistent connections, which allow browsers to issue multiple requests
over the same TCP connection. We can send sequential requests (wait for each response before
sending the next request) or pipelined requests (send multiple requests without waiting for the
responses).

URLs
In order to locate web pages we use URLs (uniform resource locators). A URL specifies a protocol
(e.g. HTTPS), a domain name (relevant to the network and transport layers, as this is where the
request will be sent) and a path (only relevant to HTTP, specific to the particular server).
The double slash after the protocol (as in https://) is not really needed, but was used to
improve readability in the past.
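
As a rough illustration, Python's `urllib.parse` can split a URL into these parts, and the parts can be turned into the text of an HTTP/1.1 GET request. The URL is a made-up example and `build_get_request` is a hypothetical helper; the request is only built as a string, not sent.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the root of the server
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"   # domain name, also resolved via DNS
        "Connection: keep-alive\r\n"    # ask to keep the TCP connection open
        "\r\n"
    )


parts = urlsplit("https://www.example.com/docs/index.html")
print(parts.scheme, parts.hostname, parts.path)
print(build_get_request("https://www.example.com/docs/index.html"))
```

Only the path ends up in the request line; the domain name is used to find the server and is repeated in the Host header.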

MIME
On the web, the browser itself parses content with the MIME type text/html. Other data is passed
to a plugin or another application. A plugin is integrated into your browser and handles MIME
types that are not text based.

CDN
Static content (e.g. JS files) can be hosted on a CDN. A Content Delivery Network is a type of
caching to increase system scalability.
An origin CDN server distributes content over multiple other CDN nodes (all over the world).
Then you make sure that users in Europe get the CDN in Europe etc. There are a few
possibilities for that:
• Use a front end to forward the requests to the right CDN node. This does mean that
there is still 1 front end which can still give delays if it's not placed well geographically
• Use DNS load balancing, you query a CDN name server which responds differently to a
request from Europe than to a request from the USA
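
A toy sketch of the DNS load balancing option: the CDN's name server answers the same name with a different IP address depending on where the request comes from. The region codes, node addresses and the fallback choice are all hypothetical.

```python
# Hypothetical CDN nodes per client region (documentation IP addresses).
CDN_NODES = {
    "EU": "198.51.100.10",
    "US": "203.0.113.20",
}
DEFAULT_REGION = "US"  # assumption: fall back to one fixed node


def cdn_dns_answer(client_region: str) -> str:
    """Return the IP address the CDN name server hands out for this region."""
    return CDN_NODES.get(client_region, CDN_NODES[DEFAULT_REGION])


print(cdn_dns_answer("EU"))
print(cdn_dns_answer("US"))
```

A real CDN would derive the region from the address of the querying resolver; here it is passed in directly to keep the sketch small.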

Streaming video and audio


Compression
In order to transfer video and audio in smaller formats, we use compression. This compression
tries to remove a lot of information and still retain almost the same amount of quality (when
viewed / listened to by a human).

For video, MPEG can be used for compression. MPEG compresses over a sequence of frames,
using motion tracking to remove temporal redundancy. It uses three types of frames:
• I (Intra-coded) frames are self-contained: such a frame stands on its own instead of relying on
neighboring frames
• P (Predictive) frames only store changes relative to previous frames
• B (Bidirectional) frames may base their prediction on both previous and future frames

Using a high compression rate, we can cut down the size of the file considerably.

Streaming stored media


There are a few approaches to streaming stored media (e.g. YouTube):
• Send the entire file up front, save it to disk at the destination and then play it
• Send a metafile request and hand off the returned metafile to a media player (which can be
integrated into the browser itself). The media player in turn sends a media request (using RTSP)
and gets a media response from the server

A few things can go wrong. How do we handle transmission errors?


• Use reliable transport (e.g. TCP). Retransmissions increase jitter (variation in delay)
significantly, but for video that can be prefetched (i.e. stored media) this can be a solution
• Use error correction in the application layer. This increases jitter, decoding complexity
and overhead
• Interleave media, send frames in a different order. This "protects" against burst errors,
because you only get a few slight gaps between frames instead of an entire second
going missing. Slightly increases jitter and decoding complexity
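
The effect of interleaving on a burst error can be shown with a small sketch. The frame numbers, the interleaving depth of 4 and the burst position are arbitrary choices for illustration.

```python
def interleave(frames, depth):
    """Reorder frames so that frames adjacent in time are sent far apart."""
    return [frames[i]
            for start in range(depth)
            for i in range(start, len(frames), depth)]


def longest_gap(lost_frames):
    """Length of the longest run of consecutive frame numbers that were lost."""
    longest = run = 0
    previous = None
    for frame in sorted(lost_frames):
        run = run + 1 if previous is not None and frame == previous + 1 else 1
        longest = max(longest, run)
        previous = frame
    return longest


frames = list(range(12))
sent_plain = frames
sent_interleaved = interleave(frames, depth=4)

# A burst error wipes out three packets in a row (positions 3, 4 and 5).
lost_plain = sent_plain[3:6]
lost_interleaved = sent_interleaved[3:6]
print(longest_gap(lost_plain), longest_gap(lost_interleaved))
```

Both transmissions lose three frames, but with interleaving the losses are spread out in playback order, at the cost of having to wait for and reorder frames at the receiver.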

When playing media, we buffer it. There are 2 things we worry about:
• We don't want the buffer to get too full, because then we lose content (so we have a
high-water mark; if it is reached, we need to download more slowly)
• We don't want the buffer to run empty, because then playback stalls (so we have a
low-water mark; if it is reached, we need to download more quickly)
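
The two marks can be sketched as a simple rate controller. The function name, the units and the speed-up and slow-down factors are assumptions for illustration; real players use more refined policies.

```python
def next_download_rate(buffer_seconds, low_mark, high_mark, playback_rate):
    """Pick a download rate based on how full the playback buffer is."""
    if buffer_seconds <= low_mark:
        return playback_rate * 1.5   # assumed factor: refill faster than we play
    if buffer_seconds >= high_mark:
        return playback_rate * 0.5   # assumed factor: let the buffer drain
    return playback_rate             # in between: keep pace with playback


# Hypothetical player: low-water mark 2 s, high-water mark 10 s, 4.0 Mbit/s video.
for level in (1, 5, 12):
    print(level, next_download_rate(level, 2, 10, 4.0))
```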

Streaming live media


Streaming live media (e.g. Twitch) is similar to the stored variant, but:
• You can't stream faster than the live rate to prefetch and get ahead. We can't just increase
the transmission speed, so we need a larger buffer
• There are often many users viewing at the same time. Ideally we use UDP with multicast
(send the same data to everyone instead of making a connection for each person), which
greatly improves efficiency. Multicast is rarely available though, so TCP connections are still
used.

Streaming interactive media


Real-time conferencing has two or more connected live streams. The key challenge here is
achieving latency that is as low as possible, since we want immediate responses.

A partial solution to this is to change the compression rate based on the bandwidth:
• If the bandwidth decreases, use more lossy compression
• If the bandwidth increases, use less lossy compression

Peer-to-peer systems
Instead of relying on a central infrastructure, users can create their own infrastructure by
connecting to each other. This is naturally very scalable (to a certain extent), whereas a
centralized infrastructure isn't.

Napster
The original idea of Napster was the following:
• A user asks a central server who has a certain file
• The server responds with the IP address of a machine having that file
• The user requests that machine for the file
• The file gets returned

The problem with this is that there is still a central server. Because Napster was widely used
for piracy, this central server was shut down.

BitTorrent
The idea of BitTorrent:
• You download the description of a file you want to get (from PirateBay for example)
• You connect to the tracker server, which provides you with a list of peers (people who
serve chunks of the file)
• You download chunks from the peers and seed (serve chunks to others yourself)

The disadvantage of this is that the tracker still needs to be online. In modern versions of
BitTorrent this tracker is distributed.

Distributed Hash Tables (DHTs)


We need to distribute the information which is normally on the tracker over many users. We
want:
• Only a small amount of information per node
• A fast lookup
• Concurrent use by all users

In order to distribute a hash table, we use a Chord Distributed Hash Table:


1. Create a ring of $2^m$ places. Every place can hold a user
2. A user's location in the ring is calculated as $\mathrm{hash}(\mathit{IP})$
3. The peers for torrent $t$ are stored at user $\mathrm{successor}(\mathrm{hash}(t))$ (the next node in the ring, if the
position of the hash is not taken)
4. Each node connects with the previous and next nodes

So imagine places 7, 17, 24, 30 are taken and $\mathrm{hash}(t)$ returns 12: we store the data at node
17.
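
A minimal sketch of these two operations in Python. Using SHA-1 as the hash function is an assumption (the notes do not name one), and `ring_position` and `successor` are illustrative helper names.

```python
import hashlib
from bisect import bisect_left


def ring_position(key: str, m: int) -> int:
    """Map a key (an IP address or torrent name) to one of the 2**m ring places."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # assumed hash function
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(nodes, position):
    """First taken place at or after position, walking clockwise along the ring."""
    ring = sorted(nodes)
    # Wrap around to the first node when position lies past the last node.
    return ring[bisect_left(ring, position) % len(ring)]


taken = [7, 17, 24, 30]
print(successor(taken, 12))   # the example from the text
print(successor(taken, 31))   # wraps around the end of the ring
print(ring_position("192.0.2.1", 5))
```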

On average, you need to query half of the ring to find a file in this way. That's not efficient in a
real situation, so we need a better structure.

So, besides only keeping track of neighbors, we also keep track of other nodes in a table (the
finger table): the $i$-th entry of the table keeps track of the node
$\mathrm{successor}(\mathit{location} + 2^i)$. In total, it keeps track of $m$ nodes (while the ring contains $2^m$ places).
Here, we define $\mathrm{successor}(x)$ as the node address $y$ for which $(y - x) \bmod 2^m$ is minimal (or: the
first node you get when walking clockwise along the ring starting at $x$).

Example
Imagine we have the following ring (where m = 5):
We then get the following finger table for location 3:
i | Start ($3 + 2^i$) | Address of successor
0 | $3 + 2^0 = 4$ | 7
1 | $3 + 2^1 = 5$ | 7
2 | $3 + 2^2 = 7$ | 7
3 | $3 + 2^3 = 11$ | 17
4 | $3 + 2^4 = 19$ | 24

Now imagine we are looking for key 27 starting from node 3:


• We need to find the node in our finger table that most closely precedes 27, which is 24
• Node 3 then contacts node 24, asking for the information. The successor of 27 is node 30, so
node 24 contacts node 30
• Node 30 returns the information for 27
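
The lookup can be sketched as a small simulation. It assumes the ring holds exactly the nodes 3, 7, 17, 24 and 30 (inferred from the finger table above) and, for simplicity, lets every node compute finger tables from a global node list, which a real Chord node cannot do.

```python
from bisect import bisect_left

M = 5
RING = 2 ** M
NODES = [3, 7, 17, 24, 30]  # assumed ring members, kept sorted


def successor(position):
    """First node at or after position, walking clockwise along the ring."""
    return NODES[bisect_left(NODES, position % RING) % len(NODES)]


def fingers(node):
    """Finger table of a node: entry i is successor(node + 2**i)."""
    return [successor(node + 2 ** i) for i in range(M)]


def lookup(start_node, key):
    """Return the list of nodes visited while finding the node storing key."""
    path = [start_node]
    node = start_node
    while node != key % RING:
        next_node = successor(node + 1)
        span = (next_node - node) % RING or RING
        if 0 < (key - node) % RING <= span:
            path.append(next_node)  # key lies between node and its successor
            break
        # Otherwise jump to the finger that most closely precedes the key.
        candidates = [f for f in fingers(node)
                      if 0 < (f - node) % RING < (key - node) % RING]
        node = max(candidates, key=lambda f: (f - node) % RING)
        path.append(node)
    return path


print(fingers(3))
print(lookup(3, 27))
```

Each hop jumps to the finger closest to the key without passing it, so the remaining distance shrinks quickly instead of walking the ring node by node.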
