GIS Training - Managing Distributed Data
DISCLAIMER STATEMENT
Information presented in this training document may be considered public information and may be reproduced, distributed or
copied freely, exclusively for non-commercial purposes and unless identified as being subject to third parties’ copyright
protection. World Food Programme (“WFP”) should be cited as the source of the information and any photo credits should be
similarly credited to the author or source, as appropriate. If a third party copyright is indicated on a photo, graphic, or on any
other material, permission to copy these materials must be obtained from the original source.
WFP declines all responsibility for errors or deficiencies in this training document or in the documentation accompanying it,
for the update of the data as well as for any damage that may arise from them. Information in the GIS Training Package is
provided without warranty of any kind, either express or implied, including, without limitation, warranties as to fitness for a
specific purpose, non-infringement, accuracy or completeness. Under no circumstances shall WFP be liable for any direct or
consequential loss, personal injury, property damage, or expense of any nature directly or indirectly incurred or suffered by
any person that is claimed to have resulted from the use of this training document or information included therein.
This training document contains advice, opinions, and statements of several individuals and information providers. WFP does
not necessarily share, represent or endorse any advice, opinion, statement or other information provided by any individual or
information provider. Reliance upon any such advice, opinion, statement, or other information shall also be at the user’s own
risk. WFP shall not be liable to any person for any inaccuracy, error, omission, deletion, defect, failure, alteration or use of
any content herein, regardless of cause, nor for any damages resulting therefrom.
This training document also contains links to third-party websites. These links are provided for the user’s convenience only.
WFP is not responsible for the contents of any linked website and does not control or guarantee the accuracy, relevance,
timeliness, or completeness of any linked information. Further, the inclusion of links is not intended to assign importance to
those sites and the information contained therein, nor is it intended to endorse or recommend any views expressed,
commercial products or services offered on these external sites that are not controlled by WFP. Reference in the GIS Training
Package to any specific commercial products, processes, or services, or the use of any trade, firm or corporation name is
only for the information and convenience of the user and does not constitute an endorsement, recommendation, or judgment
by WFP.
WFP is not liable for any infringements of third party rights as a result of any form of publication by the users of this training
document, or sections thereof, and they alone will be liable.
The designations employed and the presentation of material in maps contained in this document do not imply the expression
of any opinion on the part of WFP concerning the legal or constitutional status of any country, territory, city or sea, or
concerning the delimitation of its frontiers or boundaries.
Nothing related to this statement or this training document may be construed as a waiver, express or implied, by WFP, the
United Nations, and the Food and Agricultural Organization of the United Nations of the privileges and immunities enjoyed by
them pursuant to the 1946 Convention on the Privileges and Immunities of the United Nations, the 1947 Convention on the
Privileges and Immunities of the Specialized Agencies or otherwise under any international or national law, convention or
agreement.
All queries on rights and licenses should be addressed to the Geospatial Support Unit, World Food Programme,
Introduction
The GIS training “Managing distributed data” is part of the collection of trainings on
geodatabase management produced by the Geospatial Support Unit (GSU), at the
Emergency Preparedness and Response branch, to support the implementation of
the GIS Infrastructure in the World Food Programme.
This specific training module has been designed with the aim of guiding the user
through the usage of a set of tools provided by ESRI to exchange information
between GeoDatabases distributed in different locations. In particular we will use
replica and synchronization tools described in this training module to share GIS
datasets between WFP offices where GeoDatabases are in place.
Building a network of distributed geodatabases at HQ, Regional Bureau and Country
Office level has been identified as a solution to improve data availability and
performance by alleviating server contention and slow network access to a central
server. Each GIS officer in WFP can work on data stored in a geodatabase located
in their own office, and can get data produced by others or share their own data by
taking advantage of geodatabase replication and synchronization techniques.
All the tools described in this training module replicate and synchronize
data across ArcSDE geodatabases; therefore you must have access to a Desktop,
Workgroup or Enterprise geodatabase to go through this training.
In addition, these tools are provided by ESRI in a toolbar in ArcMap, so make sure
you have ArcGIS Desktop 10.2.2 Advanced installed on your machine.
If either of these requirements is not met, please get in touch with your
regional GIS officer or with the GSU at HQ, by sending an email to
[email protected].
Specific workflows have been identified in WFP to exchange GIS data between
offices, taking advantage of the different types of replication provided by ESRI.
In particular, for all data produced and maintained at WFP Headquarters, one-way
replication will be put in place to share it with Regional Bureaux and Country
Offices. The same will happen, in the opposite direction, for data produced at
Country Office level, while two-way replication will be used to exchange data that
can be edited by several GIS officers at HQ, RBs or COs, such as logistics
assets, WFP facilities or warehouse locations.
Data replication and synchronization tools provided by ESRI have been designed to
work in different operating contexts: scenarios where connectivity between
offices is strong and reliable, so that replica geodatabases are accessible on
the network at all times, and environments where internet connectivity is weak or
not available.
Before creating the replica you must determine which datasets to replicate and
prepare them, ensuring they meet the following requirements:
o The database user in the connection file used to create the replica must have
write access to the data.
The Add GlobalID command is available for stand-alone feature classes, feature
datasets, tables and attributed relationship classes in a geodatabase, but it cannot
be executed on individual datasets in a feature dataset. If you add a feature class
to a feature dataset in which all feature classes already have GlobalIDs, you have
to run the Add GlobalID command again on the feature dataset. It will add the
GlobalID only to the feature classes without GlobalID columns.
1.3.2 Registering data as versioned
You can only register your data as versioned when it is not in use by other people
or applications, because an exclusive lock is required to ensure the dataset is not
changing while the adds table is being created.
4. Make sure the checkbox Register the selected objects with the option to
move edits to base IS NOT selected.
5. Click OK.
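The same preparation (adding GlobalIDs, then registering as versioned without the option to move edits to base) can also be scripted. The sketch below is only an illustration: it assumes the ArcPy geoprocessing tools AddGlobalIDs_management and RegisterAsVersioned_management that ship with ArcGIS Desktop, while the function name prepare_for_replication and the gp parameter (the imported arcpy module, passed in explicitly) are hypothetical names chosen for this example.

```python
def prepare_for_replication(gp, dataset):
    """Add GlobalIDs, then register `dataset` as versioned if it is not already.

    gp      -- the imported arcpy module (hypothetical parameter name; passing it
               in keeps the logic testable without an ArcGIS installation)
    dataset -- path to a feature dataset, stand-alone feature class or table
    Returns the list of steps that were actually run.
    """
    steps = []

    # Add Global IDs only fills in the classes that lack a GlobalID column,
    # so running it again on a whole feature dataset is harmless.
    gp.AddGlobalIDs_management(dataset)
    steps.append("add_global_ids")

    # Registering needs an exclusive lock: run it when nobody else uses the data.
    if not gp.Describe(dataset).isVersioned:
        # NO_EDITS_TO_BASE corresponds to leaving "move edits to base" unchecked.
        gp.RegisterAsVersioned_management(dataset, "NO_EDITS_TO_BASE")
        steps.append("register_as_versioned")

    return steps
```

On a machine with ArcGIS Desktop it would be called after `import arcpy`, for example as `prepare_for_replication(arcpy, path_to_feature_dataset)`, using a connection whose user has write access to the data.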
Spatial: You can limit the extent included in the replica by using the current view
extent of the ArcMap document you are using to create the replica. To do
this you have to select a specific option in the Advanced Create Replica
Options, which will be described in Chapter 2. All the features intersecting
this filter will be included in the replica and will be affected by
synchronization.
Selections: If there is a selection of certain objects in the datasets you are
including in a replica, only these features will be included in the
replica.
QueryDefs: Definition queries can be used to filter the content of feature classes
and tables.
If more than one filter is used, the intersection of all filters is applied.
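To make the intersection rule concrete, the sketch below models the three filter types as plain Python checks and keeps only the features accepted by all of them. The feature records, field names and helper functions are invented for illustration; this is not how the geodatabase evaluates filters internally.

```python
# Hypothetical features: id, point coordinates and one attribute.
features = [
    {"id": 1, "x": 96.1, "y": 16.8, "type": "warehouse"},
    {"id": 2, "x": 96.2, "y": 16.9, "type": "office"},
    {"id": 3, "x": 12.5, "y": 41.9, "type": "warehouse"},
    {"id": 4, "x": 96.0, "y": 17.0, "type": "warehouse"},
]

extent = (95.0, 16.0, 97.0, 18.0)   # spatial filter: xmin, ymin, xmax, ymax
selected_ids = {1, 2, 3}            # selection filter: currently selected features


def in_extent(feature, ext):
    xmin, ymin, xmax, ymax = ext
    return xmin <= feature["x"] <= xmax and ymin <= feature["y"] <= ymax


def matches_definition_query(feature):
    # Stands in for a definition query such as: type = 'warehouse'
    return feature["type"] == "warehouse"


def passes_all_filters(feature):
    # A feature is replicated only if every active filter accepts it.
    return (in_extent(feature, extent)
            and feature["id"] in selected_ids
            and matches_definition_query(feature))


replicated = [f["id"] for f in features if passes_all_filters(f)]
print(replicated)  # [1]
```

Feature 2 fails the definition query, feature 3 lies outside the extent and feature 4 is not selected, so only feature 1 would end up in the replica.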
Metadata for the data you choose to replicate is copied during the replica creation
process. However, changes to the metadata are not applied during replica
synchronization.
o If you want to add to the replica several layers stored in the same feature
dataset, you only need to add one of them to the ArcMap document: at a
later stage the Create Replica wizard will present a list of the objects you
can include in the replica, taking into consideration the entire feature
dataset for each layer in the ArcMap document;
2.1 Creating replicas in a connected environment
If you have already prepared your data for replication, you can use the Create
Replica wizard in ArcMap to create checkout, one-way or two-way replicas.
To put in place a replica in a connected environment two connection files are
needed: one connection file to your geodatabase and one connection file to the
destination geodatabase. Both connection files must connect to the
geodatabase as a user with privileges to create new objects in the respective
geodatabase; otherwise, once the replica is in place, you will have limited
privileges (for instance, you won’t be able to change the schema of a layer by adding
or removing fields).
You find below the steps to create a one-way or two-way replica. In the example
we have chosen to include in the replica a layer containing Earthquake events
stored in a feature dataset called NHR.
2. Make sure the objects you want to include in the replica have a GlobalID
column and are Registered as Versioned;
3. Connect to your local geodatabase using the connection file and load into the
ArcMap document the objects you want to include in the replica. In this case we
have the layer: wld_nhr_EQepicenters_usgs;
6. If layers that can be replicated are in your ArcMap document, the Create
Replica button on the Distributed Geodatabase toolbar will be active, click it
to activate the Create Replica wizard.
7. If your ArcMap document includes data from more than one ArcSDE
geodatabase you are prompted to choose which ArcSDE geodatabase you want
to work with.
8. At this step the Create Replica wizard will appear, letting you choose to
create either a check-out, a two-way, or a one-way replica. In these instructions
we focus on the last two replica types, for which the creation process
is very similar. The main difference is that, if you choose a one-way
replica, you have to select whether it must be parent-to-child or child-to-parent.
In this example we select a two-way replica.
9. On the next panel, choose to replicate data or register existing data only.
The option to register existing data only allows you to create a replica between
two geodatabases containing the same data. It may be useful if your internet
connection is reliable but available bandwidth is limited. In this case you can
send data to the destination database by other means (FTP, physical drives, etc.)
and then create the replica. In this example we choose to replicate the data
as well.
12. Check the Show advanced options box and click Next to proceed.
13. The first panel of the Advanced Create Replica Options offers the choice
between a Full or Simple model. Keep the default Full model and click Next.
14. The next window lets you decide whether to limit the replica to a specific
extent (the one in the ArcMap document window selected at the beginning),
use the full extent of the data or specify an extent manually.
Through this window you can also exclude individual layers or tables from the
replica, unchecking the check box associated with that layer or table.
As mentioned before, if you selected in the first step a layer stored in a
Feature Dataset, a list of all the layers in that FD will be presented.
To include in the replica only your Earthquake layer, uncheck all the others.
You can also change the dataset's name when it is replicated, under the Target
Name column. It may be useful if you are replicating only a portion of the data.
If you have previously chosen the option to register existing data only, then
each name in the Target Name column will offer a drop-down list of available
datasets in the target geodatabase to choose from.
Each entry in the Check Out column is a combo box of options. The options
always include All Features and Schema Only. If a particular layer or table
has a selection set or definition query defined, the options may also
include Selected Features Only, All Features in Def. Query, and Selected
Features in Def. Query.
Select All Features and uncheck the Use Spatial Extent box if you do not want to
apply any data filters.
Uncheck the Replicate related data check box if you do not want to replicate
any related data and click Next. This option may be useful if relationship classes
are associated to the layers in the replica.
o When replicating tables the default option is to replicate the schema only; if
you want to replicate the data as well, make sure to select All Records in
the Check Out field, as shown in the image below;
15. For two-way replication, there is an additional panel in the Advanced Create
Replica Options dialog box that lets you choose whether the parent or child
will be the initial data sender. This option is only important in disconnected
systems.
17. At this step you can choose what you want to do once the replication has been
completed:
Take no further action — This is the default option. If you choose this
option, the replica will be created with the ArcMap document showing the
original data.
Change the layers and tables to point to the replicated data — The
current ArcMap document will be modified to point to the data in the
replica geodatabase, preserving all symbology.
Save a copy of this map document with the layers and tables
pointing at the replicated data — A new ArcMap document referencing
the data in the replica geodatabase with symbology preserved will be
created.
18. Click Summary to review the parameters for the current replica.
19. Click Finish to start replicating the data. The status of the replication
operation will be monitored in a progress dialog box.
If you are interested in learning how to create a checkout replica, please
follow the guide on the ESRI portal:
http://resources.arcgis.com/en/help/main/10.2/index.html#/Creating_a_check_out_replica/003n000000v5000000/
In this section we’ll describe the process to create a replica when Internet
connectivity is weak or unreliable. After several tests at Country Office and
Regional Bureau level, we found that this is the process most likely to
succeed when putting in place replicas in WFP, considering networking constraints
in our organization.
The process is slightly more involved than in a connected environment simply
because there are more things to remember and steps to follow.
It can be summarized as a three-phase process:
When we described the steps to put in place a replica in a connected
environment, we chose to send it to the destination geodatabase online by
selecting a connection file. In this case we have to save the replica in an XML file,
because we cannot ensure that the destination geodatabase is reachable remotely.
Once the replica is created, it must be sent to the destination geodatabase by
FTP transfer, email or physical drives and then imported into the destination
geodatabase to finalize the replica creation process.
2. Make sure the layer you want to include in the replica has a GlobalID column
and is Registered as Versioned;
3. Connect to your local geodatabase using the connection file and load into the
ArcMap document the layer you want to include in the replica. In this case we
have the layer: mmr_poi_facilities_wfp
9. In the next window choose to replicate the full extent of the data.
In the list of items to checkout you will notice that there is not only the WFP
facilities layer but also a layer containing warehouses locations. This happens
because both layers are stored in the same feature dataset (WFP).
In this example we uncheck the warehouses layer to include in the replica only
WFP facilities.
Once the process is completed the replica is registered in the source geodatabase
and all the data needed to create the replica in the destination geodatabase is
stored in the xml file.
Please note that in most cases the generated XML will be too big to be sent
by email. In this case the preferred option to send the file is FTP transfer. FTP
servers have been made available at HQ and at each Regional Bureau to support
the implementation of the GIS infrastructure in WFP. You should therefore get the
necessary information to access both FTP servers (at HQ and at your reference RB)
before creating this kind of replica.
2. Choose whether to import the geodatabase and its data by clicking the Data
checkbox.
3. Click the open folder button and navigate to the XML document containing the
replica, select it and click Open.
4. Click Next.
5. The window shown in the image below will appear. It presents a list of objects
that will be imported into the destination geodatabase; any related data (Domains,
Relationship Classes, Topologies, etc.) will also appear.
Any naming conflicts are displayed in red. To change a suggested name in
the Target Name column, overwrite the name.
For two-way and one-way replication, the same filters and relationship class rules
used in replica creation are applied during synchronization, with the exception of
filters based on a selection set. When determining the changes to send, all edits in
each replica dataset that have been applied since the last synchronization are
evaluated. If an edit satisfies the replica's filters, it will be synchronized.
Additionally, the geodatabase synchronization system detects errors occurring
during synchronization and rolls back if needed: any changes that have been
applied are removed, and the system is put back as it was before the
synchronization.
You can connect to your geodatabase and choose the replica to synchronize. All the
required message exchanges needed to complete synchronization are executed by
the system. You never have to be concerned with message exchange or which
replica is the sender or receiver.
Moreover connected synchronization allows you to choose the direction in which the
changes will be sent in a two-way replica. For example, you can send changes from
the parent replica to the child replica or from the child replica to the parent replica,
or in both directions. If you choose both directions, changes are first sent in one
direction and then sent in the opposite direction, all in one operation.
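Connected synchronization can also be run from a script. The sketch below is a hedged example that assumes the ArcPy SynchronizeChanges_management geoprocessing tool available with ArcGIS Desktop. The wrapper synchronize_replica, the gp parameter (the imported arcpy module), the friendly direction labels and the conflict settings shown are choices made for this example only, not part of the WFP workflow.

```python
# Direction keywords of the Synchronize Changes tool, keyed by a friendlier
# label (the labels are hypothetical naming used only in this sketch).
DIRECTIONS = {
    "both": "BOTH_DIRECTIONS",
    "parent_to_child": "FROM_GEODATABASE1_TO_2",
    "child_to_parent": "FROM_GEODATABASE2_TO_1",
}


def synchronize_replica(gp, parent_gdb, replica_name, child_gdb, direction="both"):
    """Synchronize one replica between two geodatabases reachable on the network.

    gp           -- the imported arcpy module
    parent_gdb   -- connection file of the geodatabase hosting the parent replica
    replica_name -- name of the replica to synchronize
    child_gdb    -- connection file of the geodatabase hosting the child replica
    """
    if direction not in DIRECTIONS:
        raise ValueError("direction must be one of: " + ", ".join(sorted(DIRECTIONS)))
    return gp.SynchronizeChanges_management(
        parent_gdb,             # geodatabase_1 (the parent in this sketch)
        replica_name,           # in_replica
        child_gdb,              # geodatabase_2 (the child in this sketch)
        DIRECTIONS[direction],  # in_direction
        "IN_FAVOR_OF_GDB1",     # conflict_policy: parent wins (an assumption)
        "BY_ATTRIBUTE",         # conflict_definition (an assumption)
        "DO_NOT_RECONCILE",     # reconcile: only relevant to checkout replicas
    )
```

Because the parent is passed as the first geodatabase, "parent_to_child" maps to FROM_GEODATABASE1_TO_2. With ArcGIS installed, the function would be called after `import arcpy`, passing `arcpy` as the first argument together with your own connection files and replica name.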
4. For two-way replicas, choose the direction in which you would like to send
changes. For checkout replicas, the only option available is to send changes
from the child replica to the parent replica. For one-way replicas, the only option
available is to send changes from the parent replica to the child replica or, if it's
a child-to-parent replica, from the child replica to the parent replica.
5. Click Next.
6. For checkout replicas, there is an option to reconcile and post with the parent
version upon synchronization. For two-way and one-way replicas, this is always
checked on.
9. Choose how you want conflicts resolved. (the first two are usually preferred)
Manual — With this policy, if a conflict occurs, the reconcile operation is aborted
and the replica is marked as in conflict. This gives you an opportunity to perform
the reconcile afterwards either manually or by running some custom reconcile
code. Once the reconcile is applied and the changes posted to the replica
version, the replica is no longer in conflict. While the replica is in conflict it can
continue to receive changes but cannot send changes.
This happens through the multi-step process described in the image below:
The data sender exports the changes to be applied to the relative replica into a Data
Change Message and sends it to the receiver. The data receiver imports this
message into its geodatabase, then exports an Acknowledgment Message and
sends it to the data sender to confirm that the data changes were received and
processed. The data sender imports the Acknowledgment Message into its geodatabase
to complete the process.
It is important for the data receiver to export acknowledgment messages as often
as possible. If no acknowledgment messages are received, the data sender resends
changes by default, and maintains the information needed to resend changes until
those changes are acknowledged. As a result, the data sender's geodatabase can
become large, and subsequent data change messages can also become large.
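The effect of missing acknowledgments can be illustrated with a toy model. The classes below are purely illustrative and are not ESRI APIs: the sender keeps every unacknowledged edit and includes it in each new data change message until the receiver confirms it.

```python
class Sender:
    """Toy data sender: remembers edits until they are acknowledged."""

    def __init__(self):
        self.pending = []  # edits not yet acknowledged by the receiver

    def edit(self, change):
        self.pending.append(change)

    def export_data_changes(self):
        # Default behaviour described above: resend everything still pending.
        return list(self.pending)

    def import_acknowledgment(self, acknowledged):
        # Drop only the edits that the receiver confirmed.
        self.pending = [c for c in self.pending if c not in acknowledged]


class Receiver:
    """Toy data receiver: applies each change once and acknowledges it."""

    def __init__(self):
        self.applied = []

    def import_data_changes(self, message):
        for change in message:
            if change not in self.applied:  # ignore changes already applied
                self.applied.append(change)

    def export_acknowledgment(self):
        return list(self.applied)


sender, receiver = Sender(), Receiver()

sender.edit("add point A")
first = sender.export_data_changes()    # contains A
receiver.import_data_changes(first)

sender.edit("add point B")              # no acknowledgment received yet
second = sender.export_data_changes()   # A is resent together with B
receiver.import_data_changes(second)

sender.import_acknowledgment(receiver.export_acknowledgment())
third = sender.export_data_changes()    # nothing left to resend

print(len(first), len(second), len(third))  # 1 2 0
```

The second message is larger than it needs to be because the first one was never acknowledged; once the acknowledgment is imported, the sender stops carrying the old edits.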
If you have particular needs that are not covered by this workflow, please refer to
ESRI’s documentation or get in touch with your GIS focal point in WFP.
http://resources.arcgis.com/en/help/main/10.2/index.html#/Disconnected_synchronization/003n000000v2000000/
We’ll now describe the entire process with a practical example, using the
disconnected replica created in the second Chapter
(Myanmar_to_HQ_WFPFacilities)
After adding a new feature in the source geodatabase we start the process to send
this change to the destination geodatabase through a disconnected
synchronization.
3.2.1 Export a data change message
1. Add the layer to an ArcMap document (the new point is highlighted in a red box);
2. Open the Distributed Geodatabase toolbar and click the Export Data Changes
Message button.
3. In the wizard, select the replica from which you would like to export data
changes and specify whether you want to save the message in a delta database or an
XML file. We suggest using an XML file if there is little data to synchronize, while
the geodatabase option is preferred if you have made a lot of edits.
Specify a self-explanatory name for the Data Change message. It should contain at
least the name of the Replica and the date as per the example below:
ExportaDataChanges_Myanmar_To_HQ_BaseLayers_17122014.xml
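File names that follow this convention can be generated rather than typed by hand. The helper below is a small sketch: the function name and the prefix are our own choices, and the date format (day, month, year, as in 17122014) is inferred from the example above.

```python
from datetime import date


def message_file_name(prefix, replica_name, when, extension="xml"):
    """Build '<prefix>_<replica name>_<DDMMYYYY>.<extension>'."""
    return "{}_{}_{}.{}".format(
        prefix, replica_name, when.strftime("%d%m%Y"), extension
    )


name = message_file_name("DataChanges", "Myanmar_To_HQ_BaseLayers", date(2014, 12, 17))
print(name)  # DataChanges_Myanmar_To_HQ_BaseLayers_17122014.xml
```

Passing `date.today()` as the third argument stamps the file with the day the message is exported.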
4. The last 3 check boxes are useful to manage particular needs. In general you
should leave the last 2 checked.
1) Include all unacknowledged data changes and new data changes since the
last export.
2) Include only those unacknowledged data changes that you have not received
an acknowledgment message for.
3) Don't include any data change messages. This option is useful for two-way
replica, for sending a message to switch roles without sending any data.
Once the Data Change Message is saved in your file system, you have to send it to
the person managing the destination geodatabase as described in the section
Creating replicas in a disconnected environment.
In order to synchronize edits made in the source geodatabase the person managing
the destination geodatabase must import the Data Change Message received.
1. Add one of the layers included in the replica that needs to be synchronized to
an ArcMap document and click the Import Message button that can be found
in the Distributed Geodatabase toolbar.
2. If prompted, choose the replica geodatabase you want to import a message to.
It is needed only if in the ArcMap document there are layers stored in different
geodatabases.
3. Choose the delta file you would like to import. (XML or Geodatabase)
5. Click Finish to finalize the process. Now the new point is present also in the
destination geodatabase.
3.2.3 Export an Acknowledge Message
3. In the wizard, select the replica from which you would like to export the
acknowledge message and specify its name.
Specify a self-explanatory name. It should contain at least the name of the Replica and
the date as per the example below:
AcknowledgeMessage_Myanmar_To_HQ_Earthquakes_20012015.xml
4. Click Finish
Once the Acknowledge Message is saved in your file system, you have to send it to
the person managing the source geodatabase as described in the section Creating
replicas in a disconnected environment.
1. Add one of the layers included in the replica that needs to be synchronized to
an ArcMap document and click the Import Message button that can be
found in the Distributed Geodatabase toolbar.
# Set workspace
workspace = r"C:\wfp_sdi_confs\synchro_script"
When a replica is created, data and schema are copied from the source
geodatabase (Parent) to the destination geodatabase (Child).
The data includes the rows to be replicated from the datasets in the replica.
The schema consists of the fields, domains, subtypes, and other properties that
describe the replicated data.
Initially, the schemas are identical on both replicas, but over time, changes might
be applied to the schema in each geodatabase.
In these cases, before synchronizing the replica you have to run through a Schema
Change process. It first compares the schema of layers in both geodatabases
(source and destination), then modifies the structure of the geodatabase receiving
the Schema Change accordingly.
If you are working with a “connected” replica you can execute a schema
change using the Compare Replica Schema tool to check differences in the
structure of the layers in the two geodatabases and then the Import Schema
Changes tool to apply changes to the destination geodatabase.
1. Connect to the destination geodatabase and add the layers included in the
replica that needs to be synchronized to an ArcMap document and click the
Compare Replica Schema button that can be found in the Distributed
Geodatabase toolbar.
2. Specify or browse the connection file to the source geodatabase.
SchemaChange_HQ_TO_OMB_Earthquakes_20012015.xml
5. Click Finish.
4.1.2 Import replica schema
1. Connect to the destination geodatabase and add one of the layers included in
the replica that needs to be synchronized to an ArcMap document and click
the Import Schema Changes button that can be found in the Distributed
Geodatabase toolbar.
2. Browse for the replica schema changes file. Replica name and Replica type
will be automatically filled in.
3. Click Next
4. The second dialog box lists the differences between the two schemas. Check
the check boxes under the Apply column for the changes you want to apply
to the replica schema.
5. Click Finish.
At this point the Earthquake layer in the destination geodatabase should have the
“schematest” field and any synchronization should work properly.
If you are working with a “disconnected” replica, before comparing the
schemas and importing schema changes you must export your schema to a file and send
it to the destination replica, where it can be compared and imported.
For this purpose we’ll use an additional tool provided in the Distributed Geodatabase
toolbar called Export Replica Schema.
For this example we’ll remove the int_staff field from the WFPFacilities layer,
which we included in a disconnected replica in Chapter 2.
1. Connect to the source geodatabase, add the layer included in the replica that
needs to be synchronized to an ArcMap document and click the Export
Replica Schema button that can be found in the Distributed Geodatabase
toolbar.
1. Connect to the destination geodatabase and add the layers included in the
replica that needs to be synchronized to an ArcMap document and click the
Compare Replica Schema button.
OutputSchemaChange_Myanmar_to_HQ_20012015.xml
5. Click Finish.
1. Connect to the destination geodatabase and add one of the layers included in
the replica that needs to be synchronized to an ArcMap document and click
the Import Schema Changes button.
2. Browse for the replica schema changes file. Replica name and Replica type
will be automatically filled in.
3. Click Next
4. The second dialog box lists the differences between the two schemas. Check
the check boxes under the Apply column for the changes you want to apply
to the replica schema.
5. Click Finish.
You can use this tool to rename, refresh, and review the properties of each replica
as well as remove datasets from a replica. You may also view all replicas with a
role of parent or child; list only checkout/check-in, one-way, two-way, or all types
of replicas; and view the replica log. This utility is available in both ArcCatalog and
ArcMap.
To open the Replica Manager, click the Manage Replicas button on the Distributed
Geodatabase toolbar.
In the example below replicas created in the Second Chapter are listed.
For each replica you can access information regarding:
• Name
• Type — The type of replica created: Check out/Check in, One way, or Two way
• Date Created — The date and time that the replica was created
• Rename — Renames the replica; type a new name and press ENTER.
• View log — Opens the Replica Log, which maintains a record for each data
message sent or received by the replica.
• Refresh — Refreshes the replica; the latest state of the replica properties are
displayed.
This last option is particularly interesting because it allows changing some replica
parameters that are fundamental for the workflows identified in WFP.
It contains three tabs: General, Description, and Advanced.
The General tab shows some additional replica parameters, such as the replica
and synchronization versions.
The Description tab shows the datasets that are included in the replica and lets
you remove them by right-clicking and choosing UnRegister from Replica;
however, they still remain in the geodatabase.
You can also see if any filters were applied to the replica dataset during replica
creation by choosing View filters.
The Advanced tab in the Replica Properties dialog box shows information about
the generation numbers associated with the replica. The generation number is a
number maintained by the geodatabase, which keeps track of messages being sent
and received by the replica. For example, the first data change message sent from
one replica to its relative would make the current generation of the replica 1. When
the relative imports that message, its relative replica generation gets set to 1.
When the replica receives an acknowledgment of the data change message, its last
acknowledged generation gets set to 1.
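The bookkeeping described above can be followed with a small simulation. The class and attribute names below are illustrative only and mirror the three counters named in the text; the real values are maintained by the geodatabase and are not set by hand.

```python
class ReplicaState:
    """Toy record of the generation numbers kept for one replica."""

    def __init__(self, name):
        self.name = name
        self.current_generation = 0            # last data change message sent
        self.last_acknowledged_generation = 0  # last sent message confirmed by the relative
        self.relative_replica_generation = 0   # last message imported from the relative

    def export_data_changes(self):
        self.current_generation += 1
        return self.current_generation

    def import_data_changes(self, generation):
        self.relative_replica_generation = generation
        return generation  # value carried back by the acknowledgment message

    def import_acknowledgment(self, generation):
        self.last_acknowledged_generation = generation


parent = ReplicaState("parent")
child = ReplicaState("child")

message = parent.export_data_changes()    # parent: current generation becomes 1
ack = child.import_data_changes(message)  # child: relative replica generation becomes 1
parent.import_acknowledgment(ack)         # parent: last acknowledged generation becomes 1

print(parent.current_generation,
      child.relative_replica_generation,
      parent.last_acknowledged_generation)  # 1 1 1
```

If the acknowledgment never arrived, the parent's last acknowledged generation would stay behind its current generation, which is how you can spot unacknowledged messages in the Advanced tab.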
The Advanced panel displays the following information about generation numbers:
The Relative Replica Connection contains the connection information of the relative
replica. You can set this information by browsing to the location of the relative
replica's connection file.
Replicas created using the workflow for connected environments already know how
to reach the relative geodatabase because the connection file to it was provided at
replica creation time. In this case, for both geodatabases the Relative Replica
Connection in the Replica Manager will already include this information.
Replicas created using the workflow for disconnected environments, instead, are not
aware of this information. The connection file to reach the relative geodatabase
needs to be provided manually if you want to synchronize data using the workflow
designed for connected environments; otherwise you will only be able to use the
disconnected one.
In contexts with limited bandwidth, high latency or unreliable internet connectivity,
you may want to create the replica using the workflow for disconnected
environments but then be able to synchronize data online when internet
connectivity allows, and use the synchronization workflow designed for disconnected
environments otherwise.
1. The person who initially created the disconnected replica must add one of the
layers in the replica to an ArcMap document and open the Replica
Manager tool;
4. Check the Persist user name and password check box. If the user name and
password are not persisted, you will be prompted to provide them every time
you want to run a synchronization online.
6 Data sharing workflows in WFP
Specific workflows have been designed to facilitate GIS data sharing across WFP
offices; they make use of distributed geodatabases and replica and synchronization
tools to overcome well-known networking issues in our organization.
In general, GIS officers at WFP headquarters produce and
maintain datasets that may be of interest to GIS colleagues working in
regional and country offices, such as layers containing information related to
natural hazards (tropical storm tracks, earthquake epicenters, flood extent areas
etc.). At the same time, GIS colleagues working in the field collect and maintain
data of great interest to other GIS practitioners in WFP, such as population
figures, food security and vulnerability data, context-related information etc.
All these datasets can be shared across offices by setting up one replica for each
flow of information. For instance, we’ll create a replica containing all data produced
at WFP headquarters, in which the geodatabase at HQ is the parent and the
destination geodatabase at the CO or RB is the child. One or more replicas will then
be created for data residing at CO or RB level; in this case the roles are inverted:
the CO or RB geodatabases are the parents and the one at HQ is the child.
Additionally we’ll consider a set of data that is managed at the HQ with a global
extent and a standardized data structure, to which the CO can contribute by editing
existing layers. These include WFP locations of interest (WFP facilities, warehouses)
and logistics infrastructure (airports, ports, UNHAS routes, border crossing points
etc.). Such data will be included in an additional replica through which the CO will
populate the global layers maintained at headquarters.
As these layers are managed at HQ with a global extent, spatial filters will be
applied at replica creation time according to the spatial extent of the country
where the destination database is hosted.
This will be a one-way replica put in place between the HQ (parent DB) and the
Country Office\Regional Bureau (child DB) using the process for disconnected
environments described in paragraph 2.2.
After creating the disconnected replica, we need to add the connection file to reach
the destination geodatabase using the steps described in paragraph 5.1.
This will enable the replica to be synchronized using the workflow for connected
environments.
At this point the replica can be synchronized using either the workflow for connected
environments or the one for disconnected environments.
Once this replica has been put in place for all Country Offices and Regional Bureaux,
there will be too many replicas to synchronize manually. For this reason,
automatic synchronization workflows will be put in place using the reference
documentation in paragraph 3.2.5.
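As an illustration only, the idea of looping over many replicas and choosing the connected workflow when the relative geodatabase can be reached, and the disconnected one otherwise, might be sketched in Python as follows. The function name and the three callables passed to it (is_reachable, sync_connected, export_for_disconnected) are hypothetical placeholders, not ArcGIS functions; the actual synchronization steps are those in the reference documentation.

```python
def synchronize_all(replicas, is_reachable, sync_connected, export_for_disconnected):
    """Pick a synchronization workflow for each replica.

    All three callables are hypothetical placeholders supplied by the caller:
      is_reachable(replica)            -> True if the relative geodatabase is online
      sync_connected(replica)          -> runs the connected synchronization
      export_for_disconnected(replica) -> prepares a data change message to
                                          transfer by other means
    Returns a dict mapping each replica name to the workflow that was used.
    """
    results = {}
    for replica in replicas:
        if is_reachable(replica):
            sync_connected(replica)
            results[replica] = "connected"
        else:
            export_for_disconnected(replica)
            results[replica] = "disconnected"
    return results
```

A scheduler could call such a function at regular intervals, so that offices with poor connectivity fall back to the disconnected workflow without manual intervention.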
In this section we’ll provide some additional information to facilitate the creation of
replicas to send GIS-related data from Country Offices\Regional Bureaux to the
WFP Headquarters.
After standardizing GIS datasets and storing them in the Enterprise Geodatabase at
the Country Office, one or more replicas can be put in place to share this data with
other GIS practitioners at the WFP headquarters and regional offices.
The number of replicas to create depends mainly on the amount of data available at
the country office and how frequently it is updated.
It is suggested to group together layers with the same update frequency;
for instance, layers containing administrative boundaries are not updated often,
while tables containing secondary data may be updated on a daily basis.
In this case the best strategy would be to create one replica containing all
boundaries datasets, another replica containing all tables, and additional ones
according to update frequency.
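The grouping strategy above can be sketched in Python as a simple planning aid. The layer names and frequency labels below are hypothetical examples, and the function only produces a plan (one candidate replica per update frequency); it does not create any replica.

```python
from collections import defaultdict


def group_layers_by_update_frequency(layers):
    """Group (layer_name, update_frequency) pairs into one list per frequency.

    Each resulting group is a candidate for a separate replica.
    """
    groups = defaultdict(list)
    for name, frequency in layers:
        groups[frequency].append(name)
    return dict(groups)


# Hypothetical inventory of datasets at a country office.
layers = [
    ("admin_boundaries_level1", "rarely"),
    ("admin_boundaries_level2", "rarely"),
    ("secondary_data_table", "daily"),
    ("market_prices_table", "daily"),
]

for frequency, names in group_layers_by_update_frequency(layers).items():
    print(frequency, names)
```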
All these replicas can be created using a workflow similar to the one described in
the previous paragraph.
Each of them will be a one-way replica where the parent geodatabase is the
Enterprise Geodatabase residing at the country office and the child geodatabase in
this case is the Enterprise Geodatabase at the HQ.
These replicas will be created using the process for disconnected environments
described in paragraph 2.2.
After creating the disconnected replica, we need to add the connection file to reach
the destination geodatabase using the steps described in paragraph 5.1.
This will enable the replica to be synchronized using the workflow for connected
environments.
At this point the replica can be synchronized using either the workflow for connected
environments (automatically or manually) or the one for disconnected
environments.
Some of the layers used daily at the WFP HQ for mapping or analysis purposes are
managed at a global extent. This set of layers includes:
1) Share the standardized data structure by creating a two-way replica between the
HQ and the CO, which will include all the layers managed at a global extent. At
replica creation time, filters will be applied to replicate only data of interest
for that specific country. Each layer contains an iso3 field in its attribute table;
for instance, when replicating data for Myanmar, definition queries will be
created to replicate only features where iso3 = MMR.
2) At the CO, edit the replicated layers by filling in the available information. For
instance, add locations of WFP offices if they do not already exist, or close
warehouses that are no longer available.
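The country filter mentioned in step 1 can be expressed as a definition query on the iso3 field. The following Python sketch builds such a query string; the function name is hypothetical, and the quoting style (single quotes around the code) is an assumption that should be checked against the syntax expected by the database in use.

```python
def iso3_definition_query(iso3_code, field="iso3"):
    """Build a definition query selecting features for one country.

    The field name defaults to the iso3 attribute described in the text.
    """
    code = iso3_code.strip().upper()
    # ISO 3166-1 alpha-3 codes are exactly three ASCII letters.
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a valid ISO3 country code: {iso3_code!r}")
    return f"{field} = '{code}'"


print(iso3_definition_query("MMR"))  # query for Myanmar
```

Validating the code before building the query avoids creating a replica with a filter that silently matches no features.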