GlusterFS Documentation
Release 3.8.0
Gluster Community
3 Installation Guide
3.1 Getting Started
3.2 Configuration
3.3 Installing Gluster
3.4 Overview
3.5 Quick Start Guide
3.6 Setup Baremetal
3.7 Deploying in AWS
3.8 Setting up in virtual machines
4 Administrator Guide
5 Upgrade Guide
6 Contributors Guide
6.1 Adding your blog
6.2 Bug Lifecycle
6.3 Bug Reporting Guidelines
6.4 Bug Triage Guidelines
7 Changelog
8 Presentations
GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed
storage solutions for media streaming, data analysis, and other data and bandwidth-intensive tasks. GlusterFS is free
and open source software.
CHAPTER 1
For this tutorial, we will assume you are using one or more Fedora 22 (or later) virtual machines. If you would like a more detailed walkthrough, with instructions for installing using different methods (local virtual machines, EC2, and bare metal) and different distributions, have a look at the Install Guide.
This section demonstrates installing and setting up GlusterFS in under five minutes. You would not want to do this in any real-world scenario.
Install glusterfs client and server packages:
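A minimal sketch of that install step, assuming Fedora 22 or later with dnf and the stock repositories (package names may differ on other distributions):
# dnf install glusterfs glusterfs-server glusterfs-fuse
# systemctl start glusterd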
Create 4 loopback devices to be consumed as bricks. This exercise is to simulate 4 hard disks in 4 different nodes.
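One possible sketch of this step; the image sizes, paths, and loop device numbers below are illustrative only:
# mkdir -p /srv/bricks
# for i in 1 2 3 4; do truncate -s 1G /srv/bricks/brick$i.img; losetup /dev/loop$i /srv/bricks/brick$i.img; done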
Format the loopback devices with an XFS filesystem and mount them.
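A sketch, continuing with the illustrative loop devices above and mounting them under /export (the mount points are assumptions, chosen to match the tree /export/ check further down):
# for i in 1 2 3 4; do mkfs.xfs -i size=512 /dev/loop$i; mkdir -p /export/brick$i; echo "/dev/loop$i /export/brick$i xfs defaults 0 0" >> /etc/fstab; done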
# mount -a
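The volume itself would then be created from the four bricks; a sketch, assuming a 2x2 distributed-replicated volume named test on this single node (force is needed because the bricks sit directly on their mount points and the replicas share one host in this toy setup):
# gluster volume create test replica 2 `hostname`:/export/brick1 `hostname`:/export/brick2 `hostname`:/export/brick3 `hostname`:/export/brick4 force
# gluster volume start test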
# mkdir /mnt/test
# mount -t glusterfs `hostname`:test /mnt/test
For illustration, create 10 empty files and see how they get distributed and replicated among the 4 bricks that make up the volume.
# touch /mnt/test/file{1..10}
# ls /mnt/test
# tree /export/
(on both nodes) Note: These examples assume the brick will reside on /dev/sdb1.
From “server1”
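A sketch of the probe, assuming the second node is reachable as server2:
# gluster peer probe server2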
Note: When using hostnames, the first server needs to be probed from one other server to set its hostname.
From “server2”
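As the note above explains, probe the first server back by name; a sketch:
# gluster peer probe server1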
Note: Once this pool has been established, only trusted members may probe new servers into the pool. A new server cannot probe the pool; it must be probed from the pool.
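Next comes setting up a volume; a sketch, assuming the /dev/sdb1 bricks mentioned above are mounted at /data/brick1 on both servers (run the mkdir on both nodes, and the gluster commands from any one of them):
# mkdir -p /data/brick1/gv0
# gluster volume create gv0 replica 2 server1:/data/brick1/gv0 server2:/data/brick1/gv0
# gluster volume start gv0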
Note: If the volume is not started, clues as to what went wrong will be in log files under /var/log/glusterfs on one or
both of the servers - usually in etc-glusterfs-glusterd.vol.log
For this step, we will use one of the servers to mount the volume. Typically, you would do this from an external
machine, known as a “client”. Since using this method would require additional packages to be installed on the client
machine, we will use one of the servers as a simple place to test first, as if it were that “client”.
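A sketch of that test, mounting from server1 and copying 100 files into the volume (the mount point /mnt and the source file are assumptions):
# mount -t glusterfs server1:/gv0 /mnt
# for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/copy-test-$i; done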
# ls -lA /mnt | wc -l
You should see 100 files returned. Next, check the GlusterFS mount points on each server:
# ls -lA /data/brick1/gv0
You should see 100 files on each server using the method we listed here. Without replication, in a distribute only
volume (not detailed here), you should see about 50 files on each one.
A volume is a collection of bricks, and most Gluster file system operations happen on the volume. The Gluster file system supports different types of volumes based on the requirements. Some volumes are good for scaling storage size, some for improving performance, and some for both.
Distributed Volume - This is the default GlusterFS volume: if you do not specify the type of volume while creating it, a distributed volume is created. Here, files are distributed across the various bricks in the volume, so file1 may be stored on brick1 or brick2 but not on both; hence there is no data redundancy. The purpose of such a volume is to scale the volume size easily and cheaply. However, this also means that a brick failure leads to complete loss of the data on that brick, and one must rely on the underlying hardware for data-loss protection.
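For instance, a sketch of creating a two-brick distributed volume (server and brick names are placeholders):
# gluster volume create test-volume server1:/exp1 server2:/exp2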
7
GlusterFS Documentation, Release 3.8.0
Replicated Volume - This volume overcomes the data-loss problem of the distributed volume: exact copies of the data are maintained on all bricks. The number of replicas in the volume is decided by the client while creating the volume, so we need at least two bricks to create a volume with 2 replicas, or a minimum of three bricks to create a volume with 3 replicas. One major advantage of such a volume is that even if one brick fails, the data can still be accessed from its replica bricks. Such a volume is used for better reliability and data redundancy.
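For example, a sketch of creating a two-brick volume with 2 replicas (server and brick names are placeholders):
# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2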
Distributed Replicated Volume - In this volume, files are distributed across replicated sets of bricks. The number of bricks must be a multiple of the replica count. The order in which we specify the bricks also matters, since adjacent bricks become replicas of each other. This type of volume is used when both high availability of data (through redundancy) and scaling of storage are required. So if there were eight bricks and a replica count of 2, the first two bricks become replicas of each other, then the next two, and so on; this volume is denoted 4x2. Similarly, with eight bricks and a replica count of 4, each set of four bricks becomes a replica set, and we denote this as a 2x4 volume.
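As a sketch, four placeholder bricks with a replica count of 2 form a 2x2 distributed-replicated volume:
# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4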
Striped Volume - Consider a large file stored in a brick that is frequently accessed by many clients at the same time. This puts too much load on a single brick and reduces performance. In a striped volume the data is divided into stripes before being stored in the bricks, so the large file is divided into smaller chunks (equal in number to the bricks in the volume) and each chunk is stored in a brick. The load is now distributed and the file can be fetched faster, but no data redundancy is provided.
Create a Striped Volume
# gluster volume create NEW-VOLNAME [stripe COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...
For example, to create a striped volume across two storage servers:
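A sketch, with placeholder server and brick names:
# gluster volume create test-volume stripe 2 transport tcp server1:/exp1 server2:/exp2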
Distributed Striped Volume - This is similar to the striped volume except that the stripes can now be distributed across a larger number of bricks. However, the number of bricks must be a multiple of the stripe count, so if we want to increase the volume size we must add bricks in multiples of the stripe count.
Create the distributed striped volume:
gluster volume create NEW-VOLNAME [stripe COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...
For example, to create a distributed striped volume across eight storage servers:
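A sketch, assuming a stripe count of 4 so that the eight placeholder bricks form two striped sets:
# gluster volume create test-volume stripe 4 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 server5:/exp5 server6:/exp6 server7:/exp7 server8:/exp8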
2.2 FUSE
GlusterFS is a userspace filesystem. This was a decision made by the GlusterFS developers initially, as getting modules into the Linux kernel is a very long and difficult process.
Being a userspace filesystem, GlusterFS uses FUSE (Filesystem in Userspace) to interact with the kernel VFS. For a long time, implementation of a userspace filesystem was considered impossible; FUSE was developed as a solution for this. FUSE is a kernel module that supports interaction between the kernel VFS and non-privileged user applications, and it has an API that can be accessed from userspace. Using this API, any type of filesystem can be written in almost any language you prefer, as there are many bindings between FUSE and other languages.
Structural diagram of FUSE
This shows a filesystem "hello world" that is compiled to create a binary "hello". It is executed with a filesystem mount point /tmp/fuse. The user then issues the command ls -l on the mount point /tmp/fuse. This command reaches VFS via glibc, and since the mount /tmp/fuse corresponds to a FUSE-based filesystem, VFS passes it over to the FUSE module. The FUSE kernel module contacts the actual filesystem binary "hello" after passing through glibc and the FUSE library in userspace (libfuse). The result is returned by "hello" through the same path and reaches the ls -l command.
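As a rough sketch of that scenario, assuming the hello.c example that ships with libfuse and a system with pkg-config metadata for fuse:
$ gcc -Wall hello.c `pkg-config fuse --cflags --libs` -o hello
$ mkdir -p /tmp/fuse
$ ./hello /tmp/fuse
$ ls -l /tmp/fuse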
The communication between FUSE kernel module and the FUSE library(libfuse) is via a special file descriptor which
is obtained by opening /dev/fuse. This file can be opened multiple times, and the obtained file descriptor is passed to
the mount syscall, to match up the descriptor with the mounted filesystem.
• More about userspace filesystems
• FUSE reference
2.3 Translators
Translating “translators”:
• A translator converts requests from users into requests for storage.
One to one, one to many, one to zero (e.g. caching). (Figure: xlator.png)
• A translator can modify requests on the way through:
– convert one request type to another (during the request transfer amongst the translators)
– modify paths, flags, even data (e.g. encryption)
• Translators can intercept or block the requests. (e.g. access control)
• Or spawn new requests (e.g. pre-fetch)
How Do Translators Work?
• Translators are implemented as shared objects
• Dynamically loaded according to ‘volfile’
– dlopen/dlsym (set up pointers to parents/children)
– call init (constructor) (call IO functions through fops)
• There are conventions for validating/passing options, etc.
• The configuration of translators (since GlusterFS 3.1) is managed through the gluster command line interface
(cli), so you don’t need to know in what order to graph the translators together.
All the translators hooked together to perform a function are called a graph. The left set of translators comprises the client stack; the right set comprises the server stack.
The GlusterFS translators can be sub-divided into many categories, but two important categories are Cluster and Performance translators:
One of the most important translators, and the first one the data/request has to go through, is the FUSE translator, which falls under the category of Mount translators.
Cluster Translators:
• DHT(Distributed Hash Table)
• AFR(Automatic File Replication)
Performance Translators:
• io-cache
• io-threads
• md-cache
• open-behind
• quick-read
• read-ahead
• readdir-ahead
• write-behind
Other Feature Translators include:
• changelog
• locks: provides internal locking operations called inodelk and entrylk, which are used by AFR to achieve synchronization of operations on files or directories that conflict with each other.
• marker
• quota
Debug Translators
• trace
• io-stats
What is DHT?
DHT is the real core of how GlusterFS aggregates capacity and performance across multiple servers. Its responsibility
is to place each file on exactly one of its subvolumes – unlike either replication (which places copies on all of its
subvolumes) or striping (which places pieces onto all of its subvolumes). It’s a routing function, not splitting or
copying.
How does DHT work?
The basic method used in DHT is consistent hashing. Each subvolume (brick) is assigned a range within a 32-bit hash space, covering the entire range with no holes or overlaps. Each file is then also assigned a value in that same space, by hashing its name. Exactly one brick will have an assigned range including the file's hash value, and so the file "should" be on that brick. However, there are many cases where that won't be the case, such as when the set of bricks (and therefore the assignment of ranges) has changed since the file was created, or when a brick is nearly full. Much of the complexity in DHT involves these special cases, which we'll discuss in a moment.
When you open() a file, the distribute translator is given exactly one piece of information to find your file: the file name. To determine where that file is, the translator runs the file name through a hashing algorithm to turn it into a number.
A few observations about DHT hash-value assignment:
1. The assignment of hash ranges to bricks is determined by extended attributes stored on directories, hence distribution is directory-specific (see the sketch after this list).
2. Consistent hashing is usually thought of as hashing around a circle, but in GlusterFS it’s more linear. There’s
no need to “wrap around” at zero, because there’s always a break (between one brick’s range and another’s) at
zero.
3. If a brick is missing, there will be a hole in the hash space. Even worse, if hash ranges are reassigned while
a brick is offline, some of the new ranges might overlap with the (now out of date) range stored on that brick,
creating a bit of confusion about where files should be.
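The per-directory layout from point 1 can be inspected directly on a brick; a sketch, assuming a brick at /data/brick1/gv0 containing a directory mydir (run on the server hosting the brick, not on the client mount):
# getfattr -n trusted.glusterfs.dht -e hex /data/brick1/gv0/mydir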
The Automatic File Replication (AFR) translator in GlusterFS makes use of extended attributes to keep track of the file operations. It is responsible for replicating the data across the bricks.
Responsibilities of AFR
As soon as GlusterFS is installed on a server node, the gluster management daemon (glusterd) binary is installed. This daemon should be running on all participating nodes in the cluster. After starting glusterd, a trusted server pool (TSP) can be created consisting of all storage server nodes (a TSP can contain even a single node). Bricks, which are the basic units of storage, can then be created as export directories on these servers. Any number of bricks from this TSP can be clubbed together to form a volume.
Once a volume is created, a glusterfsd process starts running for each of the participating bricks. Along with this, configuration files known as vol files are generated inside /var/lib/glusterd/vols/. There will be a configuration file corresponding to each brick in the volume, containing all the details about that particular brick. The configuration file required by a client process will also be created. Our filesystem is now ready to use. We can mount this volume on a client machine very easily, as follows, and use it like local storage:
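A sketch of the mount command, with placeholders for the server, volume name, and mount point:
# mount -t glusterfs <hostname-or-IP>:/<volname> /mnt/glusterfs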
The IP address or hostname can be that of any node in the trusted server pool in which the required volume was created.
When we mount the volume on the client, the client glusterfs process communicates with the servers' glusterd process. The server glusterd process sends a configuration file (vol file) containing the list of client translators, and another containing the information about each brick in the volume, with the help of which the client glusterfs process can now directly communicate with each brick's glusterfsd process. The setup is now complete and the volume is ready to serve clients.
When a system call (file operation, or fop) is issued by the client on the mounted filesystem, the VFS (identifying the type of filesystem to be glusterfs) sends the request to the FUSE kernel module. The FUSE kernel module in turn sends it to the GlusterFS process in the userspace of the client node via /dev/fuse (this has been described in the FUSE section). The GlusterFS process on the client consists of a stack of translators called the client translators, which are defined in the configuration file (vol file) sent by the storage server's glusterd process. The first among these translators is the FUSE translator, which consists of the FUSE library (libfuse). Each translator has functions corresponding to each file operation (fop) supported by glusterfs. The request hits the corresponding function in each of the translators.
Main client translators include:
• FUSE translator
• DHT translator- DHT translator maps the request to the correct brick that contains the file or directory required.
• AFR translator - It receives the request from the previous translator and, if the volume type is replicate, duplicates the request and passes it on to the Protocol client translators of the replicas.
• Protocol Client translator- Protocol Client translator is the last in the client translator stack. This translator
is divided into multiple threads, one for each brick in the volume. This will directly communicate with the
glusterfsd of each brick.
In the storage server node that contains the brick in need, the request again goes through a series of translators known
as server translators, main ones being:
• Protocol server translator
• POSIX translator
The request will finally reach VFS and then will communicate with the underlying native filesystem. The response
will retrace the same path.
2.4 Geo-Replication
Geo-replication provides asynchronous replication of data across geographically distinct locations and was introduced in GlusterFS 3.2. It mainly works across WANs and is used to replicate the entire volume, unlike AFR, which is intra-cluster replication. It is mainly useful for backing up all of the data for disaster recovery.
Geo-replication uses a master-slave model, whereby replication occurs between a master (a GlusterFS volume) and a slave (which can be a local directory or a GlusterFS volume). The slave, whether local directory or volume, is accessed using an SSH tunnel.
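A sketch of managing such a session with the gluster CLI, assuming a master volume mastervol and a slave volume slavevol on the host slavehost (all three names are placeholders):
# gluster volume geo-replication mastervol slavehost::slavevol create push-pem
# gluster volume geo-replication mastervol slavehost::slavevol start
# gluster volume geo-replication mastervol slavehost::slavevol status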
Geo-replication provides an incremental replication service over Local Area Networks (LANs), Wide Area Networks (WANs), and across the Internet.
Geo-replication over LAN
You can configure Geo-replication to mirror data over a Local Area Network.
There are mainly two aspects to asynchronously replicating data: change detection and replication.
Change detection - This covers the file-operation details that need to be captured. There are two methods to sync the detected changes, namely a) changelogs and b) xsync:
a. Changelogs - Changelog is a translator which records the necessary details for the fops that occur. The changes can be written in binary format or ASCII. There are three categories, each represented by a specific changelog format, and all three categories are recorded in a single changelog file.
Entry - create(), mkdir(), mknod(), symlink(), link(), rename(), unlink(), rmdir()
Data - write(), writev(), truncate(), ftruncate()
Meta - setattr(), fsetattr(), setxattr(), fsetxattr(), removexattr(), fremovexattr()
In order to record the type of operation and the entity it was performed on, a type identifier is used. Normally, the entity on which the operation is performed would be identified by its pathname, but we choose to use the GlusterFS internal file identifier (GFID) instead (as GlusterFS supports a GFID-based backend, the pathname field may not always be valid, and for other reasons which are out of the scope of this document). Therefore, the format of the record for the three types of operation can be summarized as follows:
Entry - GFID + FOP + MODE + UID + GID + PARGFID/BNAME [PARGFID/BNAME]
Meta - GFID of the file
Data - GFID of the file
GFIDs are analogous to inodes. Data and Meta fops record the GFID of the entity on which the operation was performed, thereby recording that there was a data/metadata change on the inode. Entry fops record at minimum a set of six or seven fields (depending on the type of operation), which is sufficient to identify what type of operation the entity underwent. Normally this record includes the GFID of the entity, the type of file operation (an integer; an enumerated value used in GlusterFS), and the parent GFID and the basename (analogous to parent inode and basename).
The changelog file is rolled over after a specific time interval. We then perform processing operations on the file, such as converting it to an understandable/human-readable format and keeping a private copy of the changelog. The library then consumes these logs and serves application requests.
b. Xsync - Marker translator maintains an extended attribute “xtime” for each file and directory. Whenever any update
happens it would update the xtime attribute of that file and all its ancestors. So the change is propagated from the node
(where the change has occurred) all the way to the root.
Consider the above directory tree structure. At time T1 the master and slave were in sync with each other.
At time T2 a new file, File2, was created. This triggers the xtime marking (where xtime is the current timestamp) from File2 up to the root, i.e., the xtime of File2, Dir3, Dir1 and finally Dir0 are all updated.
Geo-replication daemon crawls the file system based on the condition that xtime(master) > xtime(slave). Hence in our
example it would crawl only the left part of the directory structure since the right part of the directory structure still has
equal timestamp. Although the crawling algorithm is fast we still need to crawl a good part of the directory structure.
Replication - We use rsync for data replication. Rsync is an external utility which calculates the difference between the two files and sends only this difference from the source to the destination.
2.5 Terminologies
Access Control Lists Access Control Lists (ACLs) allow you to assign different permissions to different users or groups even though they do not correspond to the original owner or the owning group.
Brick Brick is the basic unit of storage, represented by an export directory on a server in the trusted storage pool.
Cluster A cluster is a group of linked computers working together closely, thus in many respects forming a single computer.
Distributed File System A file system that allows multiple clients to concurrently access data over a computer net-
work
FUSE Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like computer operating systems that
lets non-privileged users create their own file systems without editing kernel code. This is achieved by running
file system code in user space while the FUSE module provides only a “bridge” to the actual kernel interfaces.
glusterd Gluster management daemon that needs to run on all servers in the trusted storage pool.
Geo-Replication Geo-replication provides a continuous, asynchronous, and incremental replication service from one site to another over Local Area Networks (LANs), Wide Area Networks (WANs), and across the Internet.
Metadata Metadata is data providing information about one or more other pieces of data. There is no special metadata storage concept in GlusterFS; the metadata is stored with the file data itself.
Namespace Namespace is an abstract container or environment created to hold a logical grouping of unique identifiers
or symbols. Each Gluster volume exposes a single namespace as a POSIX mount point that contains every file
in the cluster.
POSIX Portable Operating System Interface [for Unix] is the name of a family of related standards specified by the
IEEE to define the application programming interface (API), along with shell and utilities interfaces for software
compatible with variants of the Unix operating system. Gluster exports a fully POSIX compliant file system.
RAID Redundant Array of Inexpensive Disks (RAID) is a technology that provides increased storage reliability through redundancy, combining multiple low-cost, less-reliable disk drive components into a logical unit where all drives in the array are interdependent.
RRDNS Round Robin Domain Name Service (RRDNS) is a method to distribute load across application servers. It
is implemented by creating multiple A records with the same name and different IP addresses in the zone file of
a DNS server.
Trusted Storage Pool A storage pool is a trusted network of storage servers. When you start the first server, the
storage pool consists of that server alone.
Userspace Applications running in user space don’t directly interact with hardware, instead using the kernel to mod-
erate access. Userspace applications are generally more portable than applications in kernel space. Gluster is a
user space application.
Volume A volume is a logical collection of bricks. Most of the gluster management operations happen on the volume.
Vol file .vol files are configuration files used by the glusterfs processes. Volfiles are usually located at /var/lib/glusterd/vols/volume-name/, e.g. vol-name-fuse.vol, export-brick-name.vol, etc. Sub-volumes in the .vol files are listed bottom-up, so tracing them forms a tree structure in which the client volumes come last in the hierarchy.
Client The machine which mounts the volume (this may also be a server).
Server The machine which hosts the actual file system in which the data will be stored.
Replicate Replication is generally done to provide redundancy of the storage for data availability.
Installation Guide
This tutorial will cover different options for getting a Gluster cluster up and running. Here is a rundown of the steps
we need to do.
To start, we will go over some common things you will need to know for setting up Gluster.
Next, choose the method you want to use to set up your first cluster:
• Within a virtual machine
• To bare metal servers
• To EC2 instances in Amazon
Finally, we will install Gluster, create a few volumes, and test using them.
No matter where you will be installing Gluster, it helps to understand a few key concepts on what the moving parts
are.
First, it is important to understand that GlusterFS isn't really a filesystem in and of itself. It concatenates existing filesystems into one (or more) big chunks so that data being written into or read out of Gluster gets distributed across multiple hosts simultaneously. This means that you can use space from any host that you have available. Typically, XFS is recommended, but Gluster can be used with other filesystems as well. EXT4 is the most common choice when XFS isn't used, but you can (and many, many people do) use another filesystem that suits you. Now that we understand that, we can define a few of the common terms used in Gluster.
• A trusted pool refers collectively to the hosts in a given Gluster Cluster.
• A node or “server” refers to any server that is part of a trusted pool. In general, this assumes all nodes are in the
same trusted pool.
• A brick is used to refer to any device (really this means filesystem) that is being used for Gluster storage.
• An export refers to the mount path of the brick(s) on a given server, for example, /export/brick1
• The term Global Namespace is a fancy way of saying a Gluster volume
• A Gluster volume is a collection of one or more bricks (of course, typically this is two or more). This is analogous to /etc/exports entries for NFS.
• GNFS and kNFS. GNFS is how we refer to our inline NFS server. kNFS stands for kernel NFS, or, as most
people would say, just plain NFS. Most often, you will want kNFS services disabled on the Gluster nodes.
Gluster NFS doesn’t take any additional configuration and works just like you would expect with NFSv3. It is
possible to configure Gluster and NFS to live in harmony if you want to.
Other notes:
• For this test, if you do not have DNS set up, you can get away with using /etc/hosts entries for the two nodes.
However, when you move from this basic setup to using Gluster in production, correct DNS entries (forward
and reverse) and NTP are essential.
• When you install the Operating System, do not format the Gluster storage disks! We will use specific settings
with the mkfs command later on when we set up Gluster. If you are testing with a single disk (not recommended),
make sure to carve out a free partition or two to be used by Gluster later, so that you can format or reformat at
will during your testing.
• Firewalls are great, except when they aren’t. For storage servers, being able to operate in a trusted environment
without firewalls can mean huge gains in performance, and is recommended.
3.2 Configuration
For Gluster to communicate within a cluster, either the firewalls have to be turned off or communication has to be enabled for each server.
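A sketch of the latter approach with iptables, run on each node with the other node's address substituted in:
# iptables -I INPUT -p all -s <ip-address-of-other-node> -j ACCEPT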
Remember that the trusted pool is the term used to define a cluster of nodes in Gluster. Choose a server to be your "primary" server. This is just to keep things simple; you will generally want to run all the commands in this tutorial from that one server. Keep in mind that running many Gluster-specific commands (like gluster volume create) on one server in the cluster will execute the same command on all other servers.
gluster peer probe (hostname of the other server in the cluster, or IP address if you don't have DNS or /etc/hosts entries)
Notice that running ‘gluster peer status‘ from the second node shows that the first node has already been added.
The most basic Gluster volume type is a “Distribute only” volume (also referred to as a “pure DHT” volume if you
want to impress the folks at the water cooler). This type of volume simply distributes the data evenly across the
available bricks in a volume. So, if I write 100 files, on average, fifty will end up on one server, and fifty will end up
on another. This is faster than a “replicated” volume, but isn’t as popular since it doesn’t give you two of the most
sought after features of Gluster — multiple copies of the data, and automatic failover if something goes wrong. To set
up a replicated volume:
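A sketch of that command, assuming the bricks live at /data/brick1/gv0 on both servers:
# gluster volume create gv0 replica 2 server1:/data/brick1/gv0 server2:/data/brick1/gv0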
Breaking this down into pieces, the first part says to create a gluster volume named gv0 (the name is arbitrary, gv0 was
chosen simply because it’s less typing than gluster_volume_0). Next, we tell it to make the volume a replica volume,
and to keep a copy of the data on at least 2 bricks at any given time. Since we only have two bricks total, this means
each server will house a copy of the data. Lastly, we specify which nodes to use, and which bricks on those nodes. The
order here is important when you have more bricks... it is possible (as of the most current release as of this writing, Gluster 3.3) to specify the bricks in such a way that you would make both copies of the data reside on a single node.
This would make for an embarrassing explanation to your boss when your bulletproof, completely redundant, always
on super cluster comes to a grinding halt when a single point of failure occurs.
Now, we can check to make sure things are working as expected:
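One way to check is with the volume info command:
# gluster volume info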
This shows us essentially what we just specified during the volume creation. The one thing to mention is the "Status". A status of "Created" means that the volume has been created but hasn't yet been started, which would cause any attempt to mount the volume to fail.
Now, we should start the volume.
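For the gv0 volume above:
# gluster volume start gv0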
For RPM-based distributions, if you will be using InfiniBand, add the glusterfs RDMA package to the installation. For RPM-based systems, yum is used as the install method in order to satisfy external dependencies such as compat-readline5.
For Debian
Download the packages
(Note from reader: The above does not work. Check http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/ for the 3.5 version, or use http://packages.debian.org/wheezy/glusterfs-server)
Install the Gluster packages (do this on both servers)
dpkg -i glusterfs_3.5.2-4_amd64.deb
For Ubuntu
Ubuntu 10 and 12: install python-software-properties:
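A sketch of these steps; the PPA name below is an assumption, so pick the one matching the release you want:
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:gluster/glusterfs-3.8
$ sudo apt-get update
$ sudo apt-get install glusterfs-server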
Note: Packages exist for Ubuntu 12.04 LTS, 12.10, 13.10, and 14.04 LTS
For Red Hat/CentOS
Download the packages (modify URL for your specific release and architecture).
For Fedora
Install the Gluster packages (do this on both servers)
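A sketch, assuming the packages come from the stock Fedora repositories:
# dnf install glusterfs glusterfs-server glusterfs-fuse
# systemctl enable glusterd
# systemctl start glusterd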
Once you are finished installing, you can move on to the configuration section.
3.4 Overview
This document will get you up to speed with some hands-on experience with Gluster by guiding you through the steps
of setting it up for the first time. If you are looking to get right into things, you are in the right place. If you want just
the bare minimum steps, see the Quick Start Guide.
If you want some in-depth information on each of the steps, you are in the right place. Both guides will get you to a working Gluster cluster, so it depends on how much time you want to spend. The Quick Start Guide should have you up and running in ten minutes or less. This guide can easily be done in a lunch break, and still gives you time to
have a quick bite to eat. The Getting Started guide can be done easily in a few hours, depending on how much testing
you want to do.
After you deploy Gluster by following these steps, we recommend that you read the Gluster Admin Guide to learn
how to administer Gluster and how to select a volume type that fits your needs. Also, be sure to enlist the help of the
Gluster community via the IRC channel or Q&A section . We want you to be successful in as short a time as possible.
Overview:
Before we begin, let’s talk about what Gluster is, dispel a few myths and misconceptions, and define a few terms. This
will help you to avoid some of the common issues that others encounter most frequently.
Gluster is a distributed scale out filesystem that allows rapid provisioning of additional storage based on your storage
consumption needs. It incorporates automatic failover as a primary feature. All of this is accomplished without a
centralized metadata server.
• Gluster is an easy way to provision your own storage backend NAS using almost any hardware you choose.
• You can add as much storage as you want to start with, and if you need more later, adding more takes just a few steps.
• You can configure failover automatically, so that if a server goes down, you don’t lose access to the data. No
manual steps are required for failover. When you fix the server that failed and bring it back online, you don’t
have to do anything to get the data back except wait. In the mean time, the most current copy of your data keeps
getting served from the node that was still running.
• You can build a clustered filesystem in a matter of minutes. . . it is trivially easy for basic setups
• It takes advantage of what we refer to as “commodity hardware”, which means, we run on just about any
hardware you can think of, from that stack of decomm’s and gigabit switches in the corner no one can figure out
what to do with (how many license servers do you really need, after all?), to that dream array you were speccing
out online. Don’t worry, I won’t tell your boss.
• It takes advantage of commodity software too. No need to mess with kernels or fine tune the OS to a tee. We
run on top of most unix filesystems, with XFS and ext4 being the most popular choices. We do have some
recommendations for more heavily utilized arrays, but these are simple to implement and you probably have
some of these configured already anyway.
• Gluster data can be accessed from just about anywhere – You can use traditional NFS, SMB/CIFS for Windows
clients, or our own native GlusterFS (a few additional packages are needed on the client machines for this, but
as you will see, they are quite small).
• There are even more advanced features than this, but for now we will focus on the basics.
• It’s not just a toy. Gluster is enterprise ready, and commercial support is available if you need it. It is used in
some of the most taxing environments like media serving, natural resource exploration, medical imaging, and
even as a filesystem for Big Data.
Most likely, yes. People use Gluster for all sorts of things. You are encouraged to ask around in our IRC channel or Q&A forums to see if anyone has tried something similar. That being said, there are a few places where Gluster is going to need more consideration than others.
• Accessing Gluster from SMB/CIFS is often going to be slow by most people's standards. If your users only access it moderately, then it most likely won't be an issue for you. On the other hand, with enough Gluster servers in the mix, some people have seen better performance with us than other solutions due to the scale-out nature of the technology.
• Gluster does not support so-called "structured data", meaning live SQL databases. Of course, using Gluster to back up and restore the database would be fine.
• Gluster is traditionally better when using file sizes of at least 16KB (with a sweet spot around 128KB or so).
3.4.4 How many billions of dollars is it going to cost to setup a cluster? Don't I need redundant networking, super fast SSD's, technology from Alpha Centauri delivered by men in black, etc...?
I have never seen anyone spend even close to a billion, unless they got the rust proof coating on the servers. You don’t
seem like the type that would get bamboozled like that, so have no fear. For purpose of this tutorial, if your laptop can
run two VM’s with 1GB of memory each, you can get started testing and the only thing you are going to pay for is
coffee (assuming the coffee shop doesn’t make you pay them back for the electricity to power your laptop).
If you want to test on bare metal, since Gluster is built with commodity hardware in mind, and because there is no
centralized meta-data server, a very simple cluster can be deployed with two basic servers (2 CPU’s, 4GB of RAM
each, 1 Gigabit network). This is sufficient to have a nice file share or a place to put some nightly backups. Gluster
is deployed successfully on all kinds of disks, from the lowliest 5200 RPM SATA to mightiest 1.21 gigawatt SSD’s.
The more performance you need, the more consideration you will want to put into how much hardware to buy, but the
great thing about Gluster is that you can start small, and add on as your needs grow.
3.4.5 OK, but if I add servers on later, don’t they have to be exactly the same?
In a perfect world, sure. Having the hardware be the same means less troubleshooting when the fires start popping up.
But plenty of people deploy Gluster on mix and match hardware, and successfully.
Get started by checking some Common Criteria
Here are the bare minimum steps you need to get Gluster up and running. If this is your first time setting things up, it is recommended that you choose your desired setup method (AWS, virtual machines, or bare metal) after step four, adding an entry to fstab. If you want to get right to it (and don't need more information), the steps below are all you need to get started.
You will need at least two nodes with a 64-bit OS and a working network connection. At least one gigabyte of RAM is the bare minimum recommended for testing, and you will want at least 8GB in any system you plan on doing any real work on. A single CPU is fine for testing, as long as it is 64-bit.
Partition, Format and mount the bricks
Assuming you have a brick at /dev/sdb:
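A sketch of this step (on both nodes), assuming the whole disk is given to a single partition /dev/sdb1 and the brick is mounted at /data/brick1:
# mkfs.xfs -i size=512 /dev/sdb1
# mkdir -p /data/brick1
# echo '/dev/sdb1 /data/brick1 xfs defaults 1 2' >> /etc/fstab
# mount -a && mount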
Note: You only need one of the three setup methods!
Setup, Method 2 – Setting up on physical servers
To set up Gluster on physical servers, I recommend two servers of very modest specifications (2 CPU's, 2GB of RAM, 1GbE). Since we are dealing with physical hardware here, keep in mind that what we are showing here is for testing purposes. In the end, remember that forces beyond your control (aka, your bosses' boss...) can force you to take that "just for a quick test" environment right into production, despite your kicking and screaming against it. To prevent this, it can be a good idea to deploy your test environment as much as possible the same way you would a production environment (in case it becomes one, as mentioned above). That being said, here is a reminder of some of the best practices we mentioned before:
• Make sure DNS and NTP are setup, correct, and working
• If you have access to a backend storage network, use it! 10GbE or InfiniBand are great if you have access to them, but even a 1GbE backbone can help you get the most out of your deployment. Make sure that the interfaces you are going to use are also in DNS, since we will be using the hostnames when we deploy Gluster
• When it comes to disks, the more the merrier. Although you could technically fake things out with a single disk,
there would be performance issues as soon as you tried to do any real work on the servers
With the explosion of commodity hardware, you don’t need to be a hardware expert these days to deploy a server.
Although this is generally a good thing, it also means that paying attention to some important, performance impacting
BIOS settings is commonly ignored. A few things I have seen cause issues when people didn’t know to look for them:
• Most manufacturers enable power saving mode by default. This is a great idea for servers that do not have high
performance requirements. For the average storage server, the performance impact of the power savings is not a
reasonable trade off
• Newer motherboards and processors have lots of nifty features! Enhancements in virtualization, newer ways of doing predictive algorithms, and NUMA are just a few to mention. To be safe, many manufacturers ship hardware with settings meant to work with as massive a variety of workloads and configurations as they have customers. One issue you could face: that blazing fast 10GbE card you were so thrilled about installing may, in many cases, end up being crippled by a default 1x speed put in place on the PCI-E bus by the motherboard.
Thankfully, most manufacturers show all the BIOS settings, including the defaults, right in the manual. It only takes a few minutes to download, and you don't even have to power off the server unless you need to make changes. More and more boards include the functionality to make changes in the BIOS on the fly without even powering the box off. One word of caution, of course: don't go too crazy. Fretting over each tiny little detail and setting is usually not worth the time, and the more changes you make, the more you need to document and implement later. Try to find the happy balance between time spent managing the hardware (which ideally should be as close to zero as possible after the initial setup) and the expected gains you get back from it.
Finally, remember that some hardware really is better than others. Without pointing fingers anywhere specifically, it
is often true that onboard components are not as robust as add-ons. As a general rule, you can safely delegate the
on-board hardware to things like management network for the NIC’s, and for installing the OS onto a SATA drive. At
least twice a year you should check the manufacturers website for bulletins about your hardware. Critical performance
issues are often resolved with a simple driver or firmware update. As often as not, these updates affect the two most
critical pieces of hardware on a machine you want to use for networked storage - the RAID controller and the NIC’s.
Once you have setup the servers and installed the OS, you are ready to move on to the install section.
Deploying in Amazon can be one of the fastest ways to get up and running with Gluster. Of course, most of what we
cover here will work with other cloud platforms.
• Deploy at least two instances. For testing, you can use micro instances (I even go as far as using spot instances
in most cases). Debates rage on what size instance to use in production, and there is really no correct answer.
As with most things, the real answer is "whatever works for you", where the trade-offs between cost and
performance are balanced in a continual dance of trying to make your project successful while making sure
there is enough money left over in the budget for you to get that sweet new ping pong table in the break room.
• For cloud platforms, your data is wide open right from the start. As such, you shouldn’t allow open access to
all ports in your security groups if you plan to put a single piece of even the least valuable information on the
test instances. By least valuable, I mean “Cash value of this coupon is 1/100th of 1 cent” kind of least valuable.
Don’t be the next one to end up as a breaking news flash on the latest inconsiderate company to allow their data
to fall into the hands of the baddies. See Step 2 for the minimum ports you will need open to use Gluster
• You can use the free “ephemeral” storage for the Gluster bricks during testing, but make sure to use some form
of protection against data loss when you move to production. Typically this means EBS backed volumes or
using S3 to periodically back up your data bricks.
Other notes:
• In production, it is recommended to replicate your VM’s across multiple zones. For purpose of this tutorial, it
is overkill, but if anyone is interested in this please let us know since we are always looking to write articles on
As we just mentioned, to set up Gluster using virtual machines, you will need at least two virtual machines with at
least 1GB of RAM each. You may be able to test with less but most users will find it too slow for their tastes. The
particular virtualization product you use is a matter of choice. Platforms I have used to test on include Xen, VMware
ESX and Workstation, VirtualBox, and KVM. For purpose of this article, all steps assume KVM but the concepts are
expected to be simple to translate to other platforms as well. The article assumes you know the particulars of how to
create a virtual machine and have installed a 64 bit linux distribution already.
Create or clone two VM’s, with the following setup on each:
• 2 disks using the VirtIO driver, one for the base OS and one that we will use as a Gluster “brick”. You can add
more later to try testing some more advanced configurations, but for now let’s keep it simple.
Note: If you have ample space available, consider allocating all the disk space at once.
• 2 NIC's using the VirtIO driver. The second NIC is not strictly required, but can be used to demonstrate setting up a separate network for storage and management traffic.
Note: Attach each NIC to a separate network.
Other notes: Make sure that if you clone the VM, that Gluster has not already been installed. Gluster generates a
UUID to “fingerprint” each system, so cloning a previously deployed system will result in errors later on.
Once these are prepared, you are ready to move on to the install section.
Administrator Guide
Upgrade Guide
Contributors Guide
As a developer/user, you have blogged about Gluster and want to share the post with the Gluster community.
You can do that by editing this file: https://github.com/gluster/planet-gluster/blob/master/data/feeds.yml
Please follow the instructions mentioned in the file and send a pull request.
Once approved, all your Gluster-related posts will appear on planet.gluster.org.
– CLOSED/NEXTRELEASE when backport of the code change that fixes the reported mainline problem
has been merged.
– CLOSED/CURRENTRELEASE when the code change that fixes the reported bug has been merged in
the corresponding release-branch and a version is released with the fix.
– CLOSED/WONTFIX when the reported problem or suggestion is valid, but any fix of the reported
problem or implementation of the suggestion would be barred from approval by the project’s Develop-
ers/Maintainers (or product managers, if existing).
– CLOSED/WORKSFORME when the problem can not be reproduced, when missing information has not
been provided, or when an acceptable workaround exists to achieve a similar outcome as requested.
– CLOSED/CANTFIX when the problem is not a bug, or when it is a change that is outside the power of
GlusterFS development. For example, bugs proposing changes to third-party software can not be fixed in
the GlusterFS project itself.
– CLOSED/DUPLICATE when the problem has been reported before, no matter if the previous report has
been already resolved or not.
If a bug report was marked as CLOSED or VERIFIED and it turns out that this was incorrect, the bug can be changed
to the status ASSIGNED or NEW.
* Please clone the existing bug for the release in which you found the issue.
* If the existing bug is against mainline and you found the issue in a release, then the cloned bug's "Depends On" field should be set to the BZ of the mainline bug.
Anyone can search in Bugzilla, you don’t need an account. Searching requires some effort, but helps avoid duplicates,
and you may find that your problem has already been solved.
Note: Please go through all the sections below to understand what information we need to put in a bug; this will help the developer root-cause and fix it.
Required Information
You should gather the information below before creating the bug report.
Package Information
Cluster Information
Volume Information
• Number of volumes
• Volume Names
• Volume on which the particular issue is seen [ if applicable ]
• Type of volumes
• Volume options if available
• Output of gluster volume info
• Output of gluster volume status
• Get the statedump of the volume with the problem
$ gluster volume statedump <volname>
This dumps a statedump per brick process in /var/run/gluster.
NOTE: Collect the statedumps from each gluster node into a directory, repeating this on all nodes containing bricks of the volume. All the collected directories can then be archived, compressed, and attached to the bug.
Brick Information
Client Information
$ uname -r
$ cat /etc/issue
• Fuse or NFS Mount point on the client with output of mount commands
• Output of df -Th command
Tool Information
• If any tools are used for testing, provide the info/version about it
• if any IO is simulated using a script, provide the script
Logs Information
• Get the sosreport from the involved gluster Node and Client [ in case of CentOS /Fedora ]
• Add a meaningful name/IP to the sosreport, by renaming/adding hostname/ip to the sosreport name
• Triaging of bugs is an important task; when done correctly, it can reduce the time between reporting a bug and
the availability of a fix enormously.
• Triagers should focus on new bugs, and try to describe the problem in a way that is easily understandable and as accurate as possible. The goal of the triagers is to reduce the time that developers need to solve the bug report.
• A triager is like an assistant who helps with information gathering and possibly the debugging of a new bug report. Because a triager helps prepare a bug before a developer gets involved, it can be a very nice role for new community members who are interested in the technical aspects of the software.
• Triagers will stumble upon many different kinds of issues, ranging from reports about spelling mistakes or unclear log messages to memory leaks causing crashes or performance issues in environments with several hundred storage servers.
Nobody expects that triagers can prepare all bug reports. Therefore most developers will be able to assist the triagers,
answer questions and suggest approaches to debug and data to gather. Over time, triagers get more experienced and
will rely less on developers.
Bug triage can be summarised in the points below:
• Is there enough information in the bug description?
• Is it a duplicate bug?
• Is it assigned to correct component of GlusterFS?
• Are the Bugzilla fields correct?
• Is the bug summary correct?
• Assigning bugs or Adding people to the “CC” list
• Fix the severity and priority.
• Decide what to do if the bug is present in multiple GlusterFS versions.
• Add appropriate Keywords to bug.
The detailed discussion of the above points follows below.
We try to meet every week in #gluster-meeting on Freenode. The meeting date and time for the next meeting is
normally updated in the agenda.
There are many different techniques and approaches to find reports to triage. One easy way is to use these pre-defined
Bugzilla reports (a report is completely structured in the URL and can manually be modified):
• New bugs that do not have the ‘Triaged’ keyword Bugzilla link
• New features that do not have the ‘Triaged’ keyword (identified by FutureFeature keyword, probably of interest
only to project leaders) Bugzilla link
• New glusterd bugs: Bugzilla link
• New Replication(afr) bugs: Bugzilla link
• New distribute (DHT) bugs: Bugzilla link
• New bugs against version 3.6: Bugzilla link
• Bugzilla tracker (can include already Triaged bugs)
• Untriaged NetBSD bugs
• Untriaged FreeBSD bugs
• Untriaged Mac OS bugs
In addition to manually checking Bugzilla for bugs to triage, it is also possible to receive emails when new bugs are
filed or existing bugs get updated.
If at any point you feel like you do not know what to do with a certain report, please first ask on IRC or the mailing lists before changing something.
To make a report useful, the same rules apply as for bug reporting guidelines.
It's hard to generalize what makes a good report. For "average" reporters it is definitely often helpful to have good steps to reproduce, the GlusterFS software version, and information about the test/production environment, such as the Linux/GNU distribution.
If the reporter is a developer, steps to reproduce can sometimes be omitted as the context is obvious. However, this can create a problem for contributors who need to find their way, hence it is strongly advised to list the steps to reproduce an issue.
Other tips:
• There should be only one issue per report. Try not to mix related or similar looking bugs per report.
• It should be possible to call the described problem fixed at some point. “Improve the documentation” or “It
runs slow” could never be called fixed, while “Documentation should cover the topic Embedding” or “The page
at http://en.wikipedia.org/wiki/Example should load in less than five seconds” would have a criterion. A good
summary of the bug will also help others in finding existing bugs and prevent filing of duplicates.
• If the bug is a graphical problem, you may want to ask for a screenshot to attach to the bug report. Make sure to
ask that the screenshot should not contain any confidential information.
6.4.4 Is it a duplicate?
Some reports in Bugzilla have already been reported before, so you can search for an already existing report. We do not recommend spending too much time on it; if a bug is filed twice, someone else will mark it as a duplicate later. If the bug is a duplicate, mark it as a duplicate in the resolution box below the comment field by setting the CLOSED DUPLICATE status, and briefly explain your action in a comment for the reporter. When marking a bug as a duplicate, it is required to reference the original bug.
If you think that you have found a duplicate but you are not totally sure, just add a comment like “This bug looks related
to bug XXXXX” (and replace XXXXX by the bug number) so somebody else can take a look and help judging.
Make sure the bug is assigned to the right component. Below is the list of GlusterFS components in Bugzilla.
• access control - Access control translator
• BDB - Berkeley DB backend storage
• booster - LD_PRELOAD’able access client
• build - Compiler, package management and platform specific warnings and errors
• cli - gluster command line
• core - Core features of the filesystem
• distribute - Distribute translator (previously DHT)
• errorgen - Error Gen Translator
• fuse - mount/fuse translator and patched fuse library
• georeplication - Gluster Geo-Replication
• glusterd - Management daemon
• HDFS - Hadoop application support over GlusterFS
• ib-verbs - Infiniband verbs transport
• io-cache - IO buffer caching translator
• io-threads - IO threads performance translator
• libglusterfsclient - API interface to access glusterfs volumes programmatically
• locks - POSIX and internal locks
• logging - Centralized logging, log messages, log rotation etc
• nfs - NFS component in GlusterFS
• nufa - Non-Uniform Filesystem Scheduler Translator
• object-storage - Object Storage
• porting - Porting GlusterFS to different operating systems and platforms
• posix - POSIX (API) based backend storage
• protocol - Client and Server protocol translators
• quick-read - Quick Read Translator
• quota - Volume & Directory quota translator
• rdma - RDMA transport
• read-ahead - Read ahead (file) performance translator
• replicate - Replication translator (previously AFR)
• rpc - RPC Layer
• scripts - Build scripts, mount scripts, etc.
• stat-prefetch - Stat prefetch translator
• stripe - Striping (RAID-0) cluster translator
• trace - Trace translator
• transport - Socket (IPv4, IPv6, unix, ib-sdp) and generic transport code
• unclassified - Unclassified - to be reclassified as other components
• unify - Unify translator and schedulers
• write-behind- Write behind performance translator
• libgfapi - APIs for GlusterFS
• tests- GlusterFS Test Framework
• gluster-hadoop - Hadoop support on GlusterFS
• gluster-hadoop-install - Automated Gluster volume configuration for Hadoop Environments
• gluster-smb - gluster smb
• puppet-gluster - A puppet module for GlusterFS
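If a bug landed on the wrong component, it can be moved either in the web UI or over the REST API. The following is a minimal sketch with curl; the bug number and API key are placeholders, and it assumes your Bugzilla accepts the api_key query parameter for authentication:
curl -s -X PUT -H 'Content-Type: application/json' \
  -d '{"component": "replicate"}' \
  'https://bugzilla.redhat.com/rest/bug/1234567?api_key=<your-api-key>'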
Tips for searching:
• As it is often hard for reporters to find the right place (product and component) to file a report, also search for duplicates outside the product and component of the bug report you are triaging.
• Use common words and try several times with different combinations of them, as there can be several ways to describe the same problem.
• Drop the ending of a verb (e.g. search for “delet” so you get reports for both “delete” and “deleting”), and also
try similar words (e.g. search both for “delet” and “remov”).
• Search using a date range: most bug reports are recent, so you can narrow and speed up the search with date delimiters under “Search by Change History” on the search page. Example: search from “2011-01-01” or “-730d” (to cover the last two years) to “Now”.
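Putting these tips together, a REST search for delete/remove problems in the replicate component that changed since the start of 2014 could look like the sketch below; last_change_time limits the results to bugs modified on or after the given date, and all values are examples only:
curl -s 'https://bugzilla.redhat.com/rest/bug?product=GlusterFS&component=replicate&summary=delet&last_change_time=2014-01-01&include_fields=id,summary'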
Summary
Sometimes the summary does not summarize the bug itself well. You may want to update the bug summary to make
the report distinguishable. A good title may contain:
• A brief explanation of the root cause (if it was found)
• Some of the symptoms people are experiencing
Normally, developers and potential assignees of an area are already CC’ed by default, but sometimes reports describe general issues or are filed against common Bugzilla products. Only if you know developers who work in the area covered by the bug report, and who accept being CC’ed or assigned to certain reports, should you add such a person to the CC field or even assign the bug report to them.
To get an idea of who works in which area and to find component owners, check the “MAINTAINERS” file in the root of the glusterfs source tree, or query recent changes in Gerrit (see Simplified dev workflow).
Please see below for information on the available values and their meanings.
Severity
This field is a pull-down of the external weighting of the bug report’s importance and can have the following values:
• urgent - catastrophic issues which severely impact the mission-critical operations of an organization. This may mean that the operational servers, development systems or customer applications are down or not functioning and no procedural workaround exists.
• high - high-impact issues in which the customer’s operation is disrupted, but there is some capacity to produce.
• medium - partial non-critical functionality loss, or issues which impair some operations but allow the customer to perform their critical tasks. This may be a minor issue with limited loss or no loss of functionality and limited impact to the customer’s functionality.
• low - general usage questions, recommendations for product enhancement, or development work.
• unspecified - importance not specified
Priority
This field is a pull-down of the internal weighting of the bug report’s importance and can have the following values:
• urgent - extremely important
• high - very important
• medium - average importance
• low - not very important
• unspecified - importance not specified
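If you prefer to record these values from a script rather than the web UI, a minimal sketch using the REST update endpoint could look as follows; the bug number and API key are placeholders, and severity and priority take the values listed above:
curl -s -X PUT -H 'Content-Type: application/json' \
  -d '{"severity": "high", "priority": "medium"}' \
  'https://bugzilla.redhat.com/rest/bug/1234567?api_key=<your-api-key>'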
During triaging you might come across a particular bug which is present across multiple versions of GlusterFS. Here is the recommended course of action:
• We should have separate bugs for each release (clone bugs where required).
• Bugs filed against released versions should depend on the corresponding bug for mainline (the master branch), if the bug applies to mainline.
– This makes sure that the fix is merged in the master branch first and can then be backported to the stable releases.
Note: When a bug depends on other bugs, it cannot be fixed until those bugs are fixed (“depends on”); conversely, it may stop other bugs from being fixed (“blocks”).
Here are some examples:
• A bug is raised for GlusterFS 3.5 and the same issue is present in mainline (master branch) and GlusterFS 3.6
– Clone the original bug for mainline.
– Clone another for 3.6.
– And have the GlusterFS 3.6 bug and GlusterFS 3.5 bug ‘depend on’ the ‘mainline’ bug
• A bug is already present for mainline, and the same issue is seen in GlusterFS 3.5.
– Clone the original bug for GlusterFS 3.5.
– And have the cloned bug (for 3.5) ‘depend on’ the ‘mainline’ bug.
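The ‘depend on’ relationship can also be set over the REST API. The sketch below adds a placeholder mainline bug (1111111) to the depends_on list of a placeholder 3.5 clone (2222222); it assumes the same Bugzilla 5.0-style endpoint and api_key authentication as in the earlier examples:
curl -s -X PUT -H 'Content-Type: application/json' \
  -d '{"depends_on": {"add": [1111111]}}' \
  'https://bugzilla.redhat.com/rest/bug/2222222?api_key=<your-api-key>'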
Keywords
Many predefined searches for Bugzilla include keywords. One example is the set of searches used for triaging. If a bug is NEW and the Triaged keyword is not set, you (as a triager) can pick it up and use this page to triage it. Once a bug is NEW and Triaged is in its list of keywords, it is ready to be picked up by a developer.
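If the predefined search page is not handy, a rough command-line equivalent is to list NEW GlusterFS bugs together with their keywords and filter out the ones that already carry Triaged, for example with jq; treat this as a sketch rather than an exact reproduction of the predefined search:
curl -s 'https://bugzilla.redhat.com/rest/bug?product=GlusterFS&status=NEW&include_fields=id,summary,keywords' \
  | jq '.bugs[] | select(.keywords | index("Triaged") | not) | {id, summary}'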
Triaged Once you are done with triaging, add the Triaged keyword to the bug so that others know its triage state. The predefined search at the top of this page will then no longer list the bug; instead, it should have moved to this list.
EasyFix By adding the EasyFix keyword, the bug gets added to the list of bugs that should be simple to fix. Adding
this keyword is encouraged for simple and well defined bugs or feature enhancements.
Patch When a patch for the problem has been attached or included inline, add the Patch keyword so that it is clear that some preparation for the development has been done already. Of course, it would have been nicer if the patch had been sent to Gerrit for review, but not everyone is ready to pass the Gerrit hurdle when they report a bug.
You can also add the Patch keyword when a bug has been fixed in mainline and the patch(es) have been identified. Add a link to the Gerrit change(s) so that backporting to a stable release is made simpler.
Documentation Add the Documentation keyword when a bug has been reported for the documentation. This helps
editors and writers in finding the bugs that they can resolve.
Tracking This keyword is used for bugs that track other bugs for a particular release, for example the 3.6 tracker bug.
FutureFeature This keyword is used for bugs that request a feature enhancement (RFE, Request For Enhancement) for a future release of GlusterFS. If you open a bug to request a feature that you would like to see in a coming version of GlusterFS, please report it with this keyword.
By adding yourself to the CC list of the bug reports that you change, you will receive follow-up emails with all comments and changes made by anybody on those reports. This helps you learn what further investigations others make. You can change in your Bugzilla settings which actions you want to receive mail for.
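Adding yourself to the CC list can likewise be scripted; in the sketch below the bug number, API key and account address are placeholders:
curl -s -X PUT -H 'Content-Type: application/json' \
  -d '{"cc": {"add": ["<your-bugzilla-account>"]}}' \
  'https://bugzilla.redhat.com/rest/bug/1234567?api_key=<your-api-key>'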
If you come across a bug, or you think a bug should go through the bug triage group, please set NEEDINFO for [email protected] on the bug.
See the Bug report life cycle for the meaning of the bug status and resolutions.
Changelog
CHAPTER 8
Presentations
This is a collection of Gluster presentations from all over the world. We have a SlideShare account where most of these presentations are stored.
GlusterFS Meetup Bangalore @ Bangalore, India (June 4, 2016)
• GlusterFS Containers - Mohamed Ashiq Liyazudeen, Humble Devassy Chirammal
• gdeploy 2.0 - Sachidananda Urs, Nandaja Varma
Ceph & Gluster FS - Software Defined Storage Meetup (January 22, 2015)
• GlusterFS - Current Features & Roadmap - Niels de Vos, Red Hat
Open source storage for bigdata: Fifth Elephant event (June 21, 2014)
• GlusterFS_Hadoop_fifth-elephant.odp - Lalatendu Mohanty, Red Hat
Red Hat Summit 2014, San Francisco, California, USA (April 14-17, 2014)
• Red Hat Storage Server Administration Deep Dive - Dustin Black, Red Hat
– GlusterFS Stack Diagram
Gluster Community Day / LinuxCon Europe 2013, Edinburgh, United Kingdom (October 22nd, 2013)
• GlusterFS Architecture & Roadmap - Vijay Bellur
• Integrating GlusterFS, QEMU and oVirt - Vijay Bellur