v2.1.0 Migration

Beginning in release 2.1.0, MapD Core provides an individual FileMgr for every table.

This change gives you the following benefits:

  1. Fixes potential race conditions when data is loaded into multiple tables simultaneously.
  2. Improves database start-up time.
  3. Creates a foundation for further improvements in the storage layer.

To take advantage of this enhancement, you must convert your existing data structures. MapD Core 2.1.0 includes a --db-convert option that moves your current data to the new structure.

Prerequisites

Before you convert your data and install MapD Core 2.1.0, you must do the following.

  • Update your MapD instance to version 2.0.4.

Converting Your Database

In the following steps, $MAPD_PATH is the path to the MapD installation directory, commonly /opt/mapd. $MAPD_DATA is the path to the directory that contains the MapD storage directory structure, commonly /var/lib/mapd/data. This path is set by the data property in your mapd.conf configuration file.

Before starting the migration process, determine the values of these variables and set them in your environment:

export MAPD_PATH=/opt/mapd
export MAPD_DATA=/var/lib/mapd/data
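
Before going further, it can help to confirm that both variables point where you expect. The following is a minimal sketch, not part of MapD: the function name check_mapd_paths is hypothetical, and it checks only the two locations this guide refers to ($MAPD_PATH/bin/mapd_server and $MAPD_DATA/mapd_data).

```shell
# Hypothetical helper: verify the two paths used throughout this guide.
# Runs in a subshell so its variables do not leak into your session.
check_mapd_paths() (
  status=0
  # The server binary is started from $MAPD_PATH/bin later in the procedure.
  if [ ! -x "$MAPD_PATH/bin/mapd_server" ]; then
    echo "mapd_server not found or not executable under $MAPD_PATH/bin" >&2
    status=1
  fi
  # The existing storage structures live in $MAPD_DATA/mapd_data.
  if [ ! -d "$MAPD_DATA/mapd_data" ]; then
    echo "no mapd_data directory under $MAPD_DATA" >&2
    status=1
  fi
  exit "$status"
)
```

Run check_mapd_paths after exporting both variables. A nonzero exit status means at least one path needs correcting before you continue.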

To convert an existing data structure to the new data structure:

  1. Stop your MapD server instance.

  2. Create a backup copy of the original MapD data directory:

    cp -R $MAPD_DATA $MAPD_DATA-backup
    
  3. Go to the $MAPD_DATA directory:

    cd $MAPD_DATA
    
  4. Rename the original mapd_data directory and create a new one for the new storage data structures.

    mv mapd_data mapd_data-backup
    mkdir mapd_data
    
  5. Copy dictionaries from the old data structure to the new data structure.

    cp -R mapd_data-backup/DB_* mapd_data/
    
  6. Set the limit on the number of open file descriptors (nofile) to at least 64000. On CentOS/RHEL, you can do this for the current session with the following command:

    sudo -E sh -c "ulimit -n 65535 && exec su $LOGNAME"
    

    To verify that the new limit is active, run ulimit -n. For further details, see How to set ulimit values.

  7. Start MapD with the mapd_server command using the following two parameters (see MapD Core Services and Utilities).

Parameter      Description
--data         Absolute path to the new top-level data directory. For example, /var/lib/mapd/data
--db-convert   Absolute path to the original mapd_data directory. For example, /var/lib/mapd/data/mapd_data-backup

For example, the following command uses the environment variables set earlier to supply absolute paths for both the new and the old data directories:

$MAPD_PATH/bin/mapd_server --flush-log \
  --cpu --data $MAPD_DATA \
  --db-convert $MAPD_DATA/mapd_data-backup

When you start MapD with the --db-convert option, it converts the old data structures to the new format, with an individual FileMgr per table. You can monitor the migration process by tailing the mapd_server.INFO log file:

tail -F $MAPD_DATA/mapd_log/mapd_server.INFO
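
The preparation in steps 2 through 6 can also be sketched as two shell functions. This is an illustration under stated assumptions, not a tool shipped with MapD: the names prepare_conversion and fd_limit_ok are hypothetical, and the sketch assumes the server is already stopped (step 1). It refuses to run if either backup location already exists, so that cp and mv never write into an earlier backup.

```shell
# Hypothetical helper covering steps 2-5. Argument: the $MAPD_DATA directory.
# Runs in a subshell, so the cd below does not change your working directory.
prepare_conversion() (
  data_dir=$1
  [ -d "$data_dir/mapd_data" ] || { echo "missing $data_dir/mapd_data" >&2; exit 1; }
  [ ! -e "$data_dir-backup" ] || { echo "$data_dir-backup already exists" >&2; exit 1; }
  [ ! -e "$data_dir/mapd_data-backup" ] || { echo "mapd_data-backup already exists" >&2; exit 1; }

  cp -R "$data_dir" "$data_dir-backup" || exit 1   # step 2: full backup copy
  cd "$data_dir" || exit 1                         # step 3
  mv mapd_data mapd_data-backup || exit 1          # step 4: rename the original
  mkdir mapd_data || exit 1                        # step 4: new, empty target
  # Step 5: copy the dictionary directories into the new structure.
  for d in mapd_data-backup/DB_*; do
    [ -e "$d" ] || continue   # the pattern matched nothing
    cp -R "$d" mapd_data/ || exit 1
  done
)

# Hypothetical check for step 6: succeeds only if the current session's
# file descriptor limit meets the 64000 minimum.
fd_limit_ok() (
  limit=$(ulimit -n)
  [ "$limit" = "unlimited" ] || [ "$limit" -ge 64000 ]
)
```

A typical sequence would be prepare_conversion "$MAPD_DATA", then fd_limit_ok in the session that will start mapd_server, and only then the mapd_server command shown in step 7.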

When loading a dataset using the SQL Importer, new data structures corresponding to an individual FileMgr per table are created automatically.

Note

If you run --db-convert more than once, you must either change the target directory or restore its contents to the state produced by step 4.
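
One way to restore the target directory before a second run is sketched below. The function name reset_conversion_target and the mapd_data-attempt-<timestamp> directory name are hypothetical. The sketch also assumes that the dictionaries copied in step 5 need to be copied again, which this guide does not state explicitly. Rather than deleting the earlier attempt, it moves it aside so that nothing is lost.

```shell
# Hypothetical helper: set aside a previous conversion attempt and recreate
# the target directory. Argument: the $MAPD_DATA directory.
# Stop the server before running it.
reset_conversion_target() (
  data_dir=$1
  cd "$data_dir" || exit 1
  [ -d mapd_data-backup ] || { echo "mapd_data-backup not found" >&2; exit 1; }
  if [ -e mapd_data ]; then
    aside="mapd_data-attempt-$(date +%Y%m%d%H%M%S)"
    [ ! -e "$aside" ] || { echo "$aside already exists" >&2; exit 1; }
    mv mapd_data "$aside" || exit 1   # keep the earlier attempt for inspection
  fi
  mkdir mapd_data || exit 1           # same state as the end of step 4
  # Assumption: repeat step 5 so the dictionaries are present again.
  for d in mapd_data-backup/DB_*; do
    [ -e "$d" ] || continue
    cp -R "$d" mapd_data/ || exit 1
  done
)
```

After reset_conversion_target "$MAPD_DATA" succeeds, the mapd_server command with --db-convert can be run again against the same directories.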