v2.1.0 Migration
Beginning in release 2.1.0, MapD Core provides an individual FileMgr for every table.
This change gives you the following benefits:
- Fixes potential race conditions when data is loaded into multiple tables simultaneously.
- Improves database start-up time.
- Creates a foundation for further improvements in the storage layer.
To take advantage of this enhancement, you must convert your existing data structures. MapD Core 2.1.0 includes a --db-convert option that moves your current data to the new structure.
Prerequisites
Before you convert your data and install MapD Core 2.1.0, you must do the following.
- Update your MapD instance to version 2.0.4.
Converting Your Database
In the following example, $MAPD_PATH is the path to the MapD installation directory, commonly /opt/mapd. $MAPD_DATA is the path to the directory containing the MapD storage directory structure. This is specified under the data property of your mapd.conf configuration file. It is commonly /var/lib/mapd/data.
Before starting the migration process, determine the values of these variables and set them in your environment:
export MAPD_PATH=/opt/mapd
export MAPD_DATA=/var/lib/mapd/data
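Before going further, it can help to confirm that both variables point where you expect. The following is a minimal sketch, not part of MapD: the function name check_mapd_env is hypothetical, and it assumes only the layout used on this page (a bin/mapd_server executable under $MAPD_PATH and a mapd_data directory under $MAPD_DATA).

```shell
# check_mapd_env: hypothetical helper, not shipped with MapD.
# Returns 0 when both variables point at the layout this page assumes.
check_mapd_env() {
    rc=0
    # The installation directory should contain the server binary.
    if [ -z "${MAPD_PATH:-}" ] || [ ! -x "$MAPD_PATH/bin/mapd_server" ]; then
        echo "MAPD_PATH does not contain an executable bin/mapd_server" >&2
        rc=1
    fi
    # The data directory should contain the existing mapd_data directory.
    if [ -z "${MAPD_DATA:-}" ] || [ ! -d "$MAPD_DATA/mapd_data" ]; then
        echo "MAPD_DATA does not contain a mapd_data directory" >&2
        rc=1
    fi
    return "$rc"
}
```

Calling check_mapd_env after the two export lines prints a message for each variable that looks wrong and returns a non-zero status if either check fails.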
To convert an existing data structure to the new data structure:
1. Stop your MapD server instance.
2. Create a backup copy of the original MapD data directory:

       cp -R $MAPD_DATA $MAPD_DATA-backup

3. Go to the $MAPD_DATA directory:

       cd $MAPD_DATA

4. Rename the original mapd_data directory and create a new one for the new storage data structures:

       mv mapd_data mapd_data-backup
       mkdir mapd_data

5. Copy dictionaries from the old data structure to the new data structure:

       cp -R mapd_data-backup/DB_* mapd_data/

6. Set the ulimit limit on the number of file descriptors (nofile) to a minimum of 64000. On CentOS/RHEL, you can do this for the current session with the following command:

       sudo -E sh -c "ulimit -n 65535 && exec su $LOGNAME"

   You can verify that the new limit is active by running ulimit -n. For further details, see How to set ulimit values.
7. Start MapD with the mapd_server command using the following two parameters (see MapD Core Services and Utilities).
Parameter | Description |
---|---|
--data | Absolute path to the new top-level data directory. For example, /var/lib/mapd/data |
--db-convert | Absolute path to the original mapd_data directory. For example, /var/lib/mapd/data/mapd_data-backup |
For example, the following command supplies the paths for both the new and old data structure directories:
$MAPD_PATH/bin/mapd_server --flush-log \
--cpu --data $MAPD_DATA \
--db-convert $MAPD_DATA/mapd_data-backup
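The directory preparation described above (back up, rename, recreate, copy dictionaries) and the file descriptor check can be sketched as two shell functions. This is an illustration under stated assumptions, not a MapD utility: the names prepare_conversion and check_nofile are hypothetical, and refusing to run when a backup already exists is a safety choice added here rather than something this page requires. Run it only after the server is stopped.

```shell
# prepare_conversion DATA_DIR
# Hypothetical helper: backs up DATA_DIR, renames mapd_data to
# mapd_data-backup, creates an empty mapd_data, and copies the DB_* entries.
prepare_conversion() {
    data_dir=$1
    if [ ! -d "$data_dir/mapd_data" ]; then
        echo "no mapd_data directory under $data_dir" >&2
        return 1
    fi
    # Added safety check (an assumption of this sketch): never overwrite a backup.
    if [ -e "$data_dir-backup" ] || [ -e "$data_dir/mapd_data-backup" ]; then
        echo "a backup already exists; refusing to overwrite it" >&2
        return 1
    fi
    # Back up the whole data directory.
    cp -R "$data_dir" "$data_dir-backup" || return 1
    # Rename the original structure and create the new, empty one.
    mv "$data_dir/mapd_data" "$data_dir/mapd_data-backup" || return 1
    mkdir "$data_dir/mapd_data" || return 1
    # Copy dictionaries; cp fails here if no DB_* entries exist.
    cp -R "$data_dir"/mapd_data-backup/DB_* "$data_dir/mapd_data/" || return 1
}

# check_nofile [MINIMUM]
# Succeeds when the current open-file limit is at least MINIMUM (default 64000).
check_nofile() {
    min=${1:-64000}
    current=$(ulimit -n)
    if [ "$current" = "unlimited" ]; then
        return 0
    fi
    [ "$current" -ge "$min" ]
}
```

With the environment variables set, a typical sequence would be prepare_conversion "$MAPD_DATA" followed by check_nofile, and then the mapd_server command shown above once both succeed.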
When starting MapD with the option --db-convert, old data structures are converted to the new format with an individual FileMgr per table. You can monitor the migration process by tailing the mapd_server.INFO log file:
tail -F $MAPD_DATA/mapd_log/mapd_server.INFO
When loading a dataset using the SQL Importer, new data structures corresponding to an individual FileMgr per table are created automatically.
Note
If you choose to run --db-convert more than once, you must change the target directory or otherwise restore its content to be consistent with the results of step 4.
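One way to restore the target directory before a repeat run is to move the partially converted mapd_data aside and create an empty one in its place. The sketch below assumes the layout used on this page; the function name reset_conversion_target and the suffix argument are hypothetical, and moving the old directory aside instead of deleting it is a cautious choice made here.

```shell
# reset_conversion_target DATA_DIR [SUFFIX]
# Hypothetical helper: sets the previous mapd_data aside as mapd_data-SUFFIX
# and leaves an empty mapd_data next to the untouched mapd_data-backup.
reset_conversion_target() {
    data_dir=$1
    suffix=${2:-failed-$(date +%Y%m%d%H%M%S)}
    if [ ! -d "$data_dir/mapd_data-backup" ]; then
        echo "no mapd_data-backup under $data_dir; nothing to convert from" >&2
        return 1
    fi
    if [ -e "$data_dir/mapd_data-$suffix" ]; then
        echo "$data_dir/mapd_data-$suffix already exists" >&2
        return 1
    fi
    # Keep the earlier attempt for inspection instead of deleting it.
    if [ -e "$data_dir/mapd_data" ]; then
        mv "$data_dir/mapd_data" "$data_dir/mapd_data-$suffix" || return 1
    fi
    mkdir "$data_dir/mapd_data"
}
```

After this, the dictionary copy would presumably need to be repeated before restarting the server with --db-convert; that is an inference from the order of the steps, not something the note states.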