Developer's Guide to Empress Data Migration Tool

Introduction

In version 9 of the Empress Framework, we introduced the Data Migration Tool. It simplifies syncing data between a remote source and a DocType, acting as a middleware layer between your Empress-based website and the remote data source.

Today, we’ll dig into the technical details of this tool by creating a connector that pushes ERPNext Items to an imaginary service we’ll call Atlas.

Data Migration Plan

A Data Migration Plan encapsulates a set of mappings. To create a new Data Migration Plan, set the plan name to ‘Atlas Sync’ and add mappings in the Mappings child table.

Data Migration Mapping

A Data Migration Mapping represents a set of rules that dictate field-to-field mapping.

To create a new Data Migration Mapping, name it ‘Item to Atlas Item’. A mapping is defined by a handful of values that describe the structure of the local and remote data:

  • Remote Objectname: A name that identifies the remote object, for example, Atlas Item.
  • Remote Primary Key: The name of the primary key of Atlas Item, for example, id.
  • Local DocType: The DocType which will be used for syncing, for example, Item.
  • Mapping Type: A mapping can be of type ‘Push’ or ‘Pull’, depending on whether local data is pushed to the remote source or remote data is pulled into the local site. It can also be ‘Sync’, which performs both push and pull operations in a single cycle.
  • Page Length: This defines the batch size of the sync.
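
If you prefer to script the setup, the same mapping could also be created from the bench console. The following is a minimal sketch, assuming the DocType fieldnames mirror the labels listed above (they may differ in your version); the field mappings themselves are appended in the next section.

    import frappe

    # Sketch only: fieldnames are assumed to mirror the labels above
    mapping = frappe.get_doc({
        'doctype': 'Data Migration Mapping',
        'mapping_name': 'Item to Atlas Item',
        'remote_objectname': 'Atlas Item',
        'remote_primary_key': 'id',
        'local_doctype': 'Item',
        'mapping_type': 'Push',
        'page_length': 10
    })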

Field Mappings:

Field mappings are typically defined by specifying fieldnames of the remote and local object. For a one-way mapping (push or pull), the source fieldname can also take a literal value in quotes (e.g. "GadgetTech") or an eval statement (e.g. "eval:frappe.db.get_value('Company', 'Gadget Tech', 'default_currency')"). In a push mapping, for example, the local fieldname can be set to a quoted string or an eval expression rather than a fieldname from the local DocType, as illustrated in the sketch below.
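
Continuing the sketch above, field mapping rows could be appended to the mapping’s child table. The child table name (fields), its column names, and the Atlas Item fields (title, brand, currency) are assumptions made for illustration.

    # Plain fieldname-to-fieldname mapping
    mapping.append('fields', {
        'local_fieldname': 'item_name',
        'remote_fieldname': 'title'
    })
    # A quoted literal is pushed as-is for every document
    mapping.append('fields', {
        'local_fieldname': '"GadgetTech"',
        'remote_fieldname': 'brand'
    })
    # An eval expression is evaluated on the local site before pushing
    mapping.append('fields', {
        'local_fieldname': "eval:frappe.db.get_value('Company', 'Gadget Tech', 'default_currency')",
        'remote_fieldname': 'currency'
    })
    mapping.insert()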

After defining field mappings, add the ‘Item to Atlas Item’ mapping to our Data Migration Plan and save it.
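
Scripted, this step might look like the sketch below. The plan’s fieldnames and the ‘Integrations’ module are assumptions; the module you pick determines where the mapping module described in the next section is generated.

    import frappe

    # Sketch only: fieldnames are assumed and may differ per version
    plan = frappe.get_doc({
        'doctype': 'Data Migration Plan',
        'plan_name': 'Atlas Sync',
        'module': 'Integrations',  # hypothetical module of your app
        'mappings': [
            {'mapping': 'Item to Atlas Item'}
        ]
    })
    plan.insert()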

Pre and Post Process:

Data migration often requires additional steps beyond one-to-one field mapping. When a Data Migration Mapping is added to a Plan, a mapping module is generated inside the module specified in that Plan.

In this module, you can implement the pre_process method (which receives the source doc) and the post_process method (which receives both the source and target docs, along with any additional arguments) to extend the mapping process.
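
The generated module might look roughly like the sketch below; the exact file path and function signatures depend on your framework version.

    # item_to_atlas_item.py, generated under the module specified in the Plan

    def pre_process(doc, **kwargs):
        # Receives the source doc (the local Item for a push mapping).
        # Adjust or enrich the values before they are mapped to the
        # remote object, then return the doc.
        doc.item_name = (doc.item_name or '').strip()
        return doc

    def post_process(local_doc=None, remote_doc=None, **kwargs):
        # Receives both the source and target docs, plus any additional
        # arguments, after the target has been written. Useful for
        # logging or storing back-references.
        pass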

Data Migration Connector

To connect to the remote source, create a new Data Migration Connector.

To do this, add ‘Atlas’ as a new Connector Type option in the Data Migration Connector DocType. Then create a new Data Migration Connector, select Atlas as the Connector Type, and enter the hostname, username, and password of your Atlas instance for authentication.

Next, write the code for your connector in a new file called atlas_connection.py in the directory frappe/data_migration/doctype/data_migration_connector/connectors/.

In this file, implement the insert, update, and delete methods for the Atlas connector, and write the code that connects to your Atlas instance in the __init__ method.
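
Here is a minimal sketch of what atlas_connection.py could look like. The BaseConnection class, the get_password() helper, and the method signatures are assumed to follow the framework’s built-in connectors, and AtlasClient stands in for a hypothetical SDK of the imaginary Atlas service.

    from .base import BaseConnection

    # Hypothetical client library for the imaginary Atlas service
    from atlas_sdk import AtlasClient


    class AtlasConnection(BaseConnection):
        def __init__(self, connector):
            # connector is the Data Migration Connector document holding
            # the hostname, username and (encrypted) password
            self.connector = connector
            self.connection = AtlasClient(
                host=connector.hostname,
                username=connector.username,
                password=self.get_password()
            )
            self.name_field = 'id'

        def insert(self, doctype, doc):
            # Create a new remote object (e.g. an Atlas Item)
            return self.connection.insert(doctype, doc)

        def update(self, doctype, doc, migration_id):
            # Update the remote object identified by migration_id
            return self.connection.update(doctype, doc, migration_id)

        def delete(self, doctype, migration_id):
            # Delete the remote object identified by migration_id
            return self.connection.delete(doctype, migration_id)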

After creating the Atlas connection class, import it in data_migration_connector.py so that the framework can instantiate it whenever the Atlas Connector Type is selected.
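
The wiring could look roughly like this simplified sketch; the actual file also handles the framework’s built-in connector types, and the get_connection method name is assumed to match them.

    # data_migration_connector.py (simplified sketch)
    from frappe.model.document import Document

    from .connectors.atlas_connection import AtlasConnection


    class DataMigrationConnector(Document):
        def get_connection(self):
            # Return a connection object matching the selected Connector Type
            if self.connector_type == 'Atlas':
                self.connection = AtlasConnection(self)
            return self.connection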

Data Migration Run

The final step is to create a new Data Migration Run.

A Data Migration Run executes a Data Migration Plan using a Data Migration Connector, according to our configuration. It takes care of queueing, batching, delta updates, and more.
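
From the console, creating and executing a run might look like the sketch below. The plan name is the one used earlier in this guide, ‘Atlas Connector’ is a placeholder for whatever you named your connector document, and the run() call is assumed to enqueue the sync.

    import frappe

    # Sketch only: fieldnames and the run() method are assumptions
    run = frappe.get_doc({
        'doctype': 'Data Migration Run',
        'data_migration_plan': 'Atlas Sync',
        'data_migration_connector': 'Atlas Connector'
    }).insert()

    run.run()  # pushes Items to Atlas and reports progress in real time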

Upon execution, the Data Migration Run will push our Items to the remote Atlas instance, displaying the progress in real-time.

Note that a run cannot be executed again once it is completed; you will have to create another run and execute it. Runs are designed for efficiency: the next run will only push those items that were changed, or that failed, during the previous run.

In conclusion, the Data Migration Tool in the Empress Framework is a powerful feature for developers, enabling seamless data syncing between local and remote data sources. It provides an extensive degree of control and customization to ensure that your data migration meets your specific needs.