Configuration¶
MapD Core Database requires very little configuration and supports a number of optional settings. This topic describes the required and optional configuration changes you can make to your MapD Core Database instance.
Data Directory¶
Before starting MapD Core Database, you must initialize the persistent data directory. To do so, create an empty directory at the desired path, such as /var/lib/mapd. First, set the environment variable $MAPD_STORAGE to that path:
export MAPD_STORAGE=/var/lib/mapd
Create the directory and change its owner to the user that the server will run as ($MAPD_USER):
sudo mkdir -p $MAPD_STORAGE
sudo chown -R $MAPD_USER $MAPD_STORAGE
Here, $MAPD_USER is the system user account that the server runs as, such as mapd, and $MAPD_STORAGE is the desired path to the parent of the MapD Core Database data directory.
Finally, run $MAPD_PATH/bin/initdb with the data directory path as the argument:
$MAPD_PATH/bin/initdb $MAPD_STORAGE
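Because the data directory is meant to start out empty, a quick pre-flight check can help avoid pointing initdb at the wrong path. The following is a minimal sketch and not part of MapD: is_empty_dir is a hypothetical helper name, and the idea that initdb should only be run against an empty directory is an assumption drawn from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MapD): succeeds only if the
# given path is an existing directory with no entries in it.
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Example use, assuming MAPD_PATH and MAPD_STORAGE are already set:
#   if is_empty_dir "$MAPD_STORAGE"; then
#       "$MAPD_PATH/bin/initdb" "$MAPD_STORAGE"
#   else
#       echo "refusing to initialize: $MAPD_STORAGE is missing or not empty" >&2
#   fi
```

The check is deliberately conservative: a missing directory and a non-empty directory are both treated as "do not initialize".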
Configuration File¶
MapD Core supports storing options in a configuration file. This is useful if, for example, you need to run the MapD Core Server and Web Server on different ports than their defaults.
If you store a copy of mapd.conf in the $MAPD_STORAGE directory, the configuration settings are picked up automatically by the sudo systemctl start mapd_server and sudo systemctl start mapd_web_server commands.
Set the flags in the configuration file using the format <flag> = <value>. Strings must be enclosed in quotes. The following is a sample configuration file. The entry for the data path is a string and must be in quotes. The entry for the optional read-only flag is the boolean value true and is not in quotes.
port = 9091
http-port = 9090
data = "/var/lib/mapd/data"
read-only = true
[web]
port = 9092
frontend = "/home/osboxes/installs/mapd-3.0.0-20170502-9e5ba95-Linux-x86_64-render/frontend"
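As an illustration of the port-change use case mentioned earlier, the sketch below writes a small mapd.conf to a scratch directory so that it can be reviewed before being copied into $MAPD_STORAGE. The port numbers 9191, 9190, and 9192 are arbitrary example values. Placing backend-url in the [web] section is an assumption based on the web server flag list, so verify it against your installation.

```shell
#!/bin/sh
# Write the example to a scratch directory rather than to $MAPD_STORAGE,
# so that an existing mapd.conf is never overwritten by accident.
conf_dir=$(mktemp -d)

cat > "$conf_dir/mapd.conf" <<'EOF'
port = 9191
http-port = 9190
data = "/var/lib/mapd/data"
[web]
port = 9192
backend-url = "http://localhost:9190"
EOF

echo "wrote $conf_dir/mapd.conf"
```

Note that string values (data, backend-url) are quoted and numeric values are not, matching the format rules above. When http-port moves away from its default, the web server's backend-url presumably has to follow it, which is why both appear here.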
Configuration Flags for MapD Server¶
Flag | Description | Default | Why Change It? |
---|---|---|---|
cpu | Set this flag on a GPU installation to instruct it to use only CPUs. (You do not have to explicitly set this flag for a CPU-only installation.) | FALSE | One use case for disabling GPUs is during database conversion, which requires moving a large amount of data with minimal processing. |
gpu | Run on GPUs and CPUs. | TRUE | Default. |
read-only | Enable read-only mode. | FALSE | Prevents inadvertent (or nefarious) changes to the dataset. |
port | Port number. | 9091 | Change the port number if it collides with another service on the host. Ideally, your host only runs MapD services. |
ldap-uri | LDAP server URI. | N/A | N/A |
ldap-ou-dc | LDAP Organizational Unit and Domain Component. | ou=users,dc=mapd,dc=com | N/A |
http-port | HTTP port number. | 9090 | Change the port number if it collides with another service on the host. Ideally, your host only runs MapD services. |
flush-log | Force aggressive log file flushes. | FALSE | When you set this flag, the system writes messages to disk as they are generated, rather than holding them until a particular threshold is reached. |
num-gpus | Number of GPUs to use. | -1 | In a shared environment, you can assign a fixed number of GPUs to a particular application. The default is -1, which means use all available GPUs. |
start-gpu | First GPU to use. | 0 | In a shared environment, if you want to reserve a set number of GPUs for a particular process, you can configure another process to use GPUs starting at a higher device ID. |
cluster | Indicates that the MapD Core Database instance is an aggregator node, and where to find the rest of its cluster. | $MAPD_STORAGE | This setting is not likely to change in a production environment. |
string-servers | Path to the string servers list JSON file. | $MAPD_STORAGE | This setting is not likely to change in a production environment. |
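To make the num-gpus and start-gpu descriptions concrete, the sketch below lists which device IDs two processes would end up with on a shared four-GPU host. gpu_ids is a hypothetical helper, and it assumes that a process uses a contiguous block of device IDs beginning at start-gpu, which the table implies but does not state outright.

```shell
#!/bin/sh
# Hypothetical helper: print the device IDs covered by a given
# start-gpu / num-gpus pair, assuming a contiguous range.
gpu_ids() {
    start=$1
    count=$2
    i=0
    ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids${ids:+ }$((start + i))"
        i=$((i + 1))
    done
    echo "$ids"
}

gpu_ids 0 2   # first process:  start-gpu = 0, num-gpus = 2  -> prints "0 1"
gpu_ids 2 2   # second process: start-gpu = 2, num-gpus = 2  -> prints "2 3"
```

Under that assumption, giving the second process start-gpu = 2 keeps it off the two devices reserved for the first.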
Configuration Flags for MapD Web Server¶
Flag | Description | Default | Why Change It? |
---|---|---|---|
-b, --backend-url | URL to http-port on mapd_server | http://localhost:9090 | Change to avoid collisions with other services. |
--cert | Certificate file for HTTPS | cert.pem | Change for testing and debugging. |
-c, --config | Path to MapD configuration file | | Change for testing and debugging. |
-d, --data | Path to MapD data directory | data | Change for testing and debugging. |
--docs | Path to documentation directory | docs | |
--enable-https | Enable HTTPS support | | Change to enable secure HTTP. |
-f, --frontend | Path to frontend directory | frontend | |
--key | Key file for HTTPS | key.pem | Change for testing and debugging. |
-p, --port | Frontend server port | 9092 | Change to avoid collisions with other services. |
-r, --read-only | Enable read-only mode | | Prevent inadvertent (or nefarious) changes to the data. |
--servers-json | Path to servers.json | | Change for testing and debugging. |
--timeout | Maximum request duration in #h#m#s format | 1h0m0s | The --timeout option controls the maximum duration of individual HTTP requests. This is used to manage resource exhaustion caused by improperly closed connections. One side effect of this option is that it limits the execution time of queries made over the Thrift HTTP transport. Increase this timeout if you expect queries to take longer than the default duration of one hour: for example, if you perform a COPY FROM on a large file when using mapdql with the HTTP transport. |
--tmpdir | Path for temporary file storage | /tmp | The temporary directory is used as a staging location for file uploads. It is sometimes desirable to place this directory on the same file system as the MapD Core data directory. If not specified on the command line, mapd_web_server also respects the standard TMPDIR environment variable as well as a specific MAPD_TMPDIR environment variable, the latter of which takes precedence. If you use neither the command line argument nor one of the environment variables, the default, /tmp, is used. |
-v, --verbose | Print all log messages to stdout | | Change for testing and debugging. |
--version | Return version | | |
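The precedence described for --tmpdir can be summarized in a few lines of shell. This is only an illustration of the documented order (command line, then MAPD_TMPDIR, then TMPDIR, then /tmp). resolve_tmpdir is a hypothetical helper and is not how mapd_web_server itself is implemented. Treating an empty variable the same as an unset one is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented --tmpdir precedence.
# $1 is the value given on the command line, or empty if none was given.
resolve_tmpdir() {
    if [ -n "$1" ]; then
        echo "$1"                 # 1. explicit --tmpdir argument
    elif [ -n "${MAPD_TMPDIR:-}" ]; then
        echo "$MAPD_TMPDIR"       # 2. MapD-specific environment variable
    elif [ -n "${TMPDIR:-}" ]; then
        echo "$TMPDIR"            # 3. standard environment variable
    else
        echo "/tmp"               # 4. built-in default
    fi
}

# Example: resolve_tmpdir ""   prints /tmp when neither variable is set.
```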
Using Configuration Flags on the Command Line¶
To use options provided in a configuration file, set the --config flag to the path of the configuration file for mapd_server and mapd_web_server. For example:
$MAPD_PATH/bin/mapd_server --config $MAPD_STORAGE/mapd.conf
You also have the option of specifying configuration settings at the command line. MapD recommends that you use the systemctl command to start and stop the servers, but you can use the mapd_server and mapd_web_server commands with configuration flags for testing and debugging.
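For a quick test run without a configuration file, the flags can be passed directly. The sketch below only assembles and prints a mapd_server command line so that it can be checked before being run by hand. The fallback install path /opt/mapd is an assumption for illustration, and the flag values shown are the documented defaults plus --read-only.

```shell
#!/bin/sh
# Fallback values are assumptions for illustration; set MAPD_PATH and
# MAPD_STORAGE to match your installation.
MAPD_PATH=${MAPD_PATH:-/opt/mapd}
MAPD_STORAGE=${MAPD_STORAGE:-/var/lib/mapd}

# Assemble the command as a string and print it instead of executing it.
cmd="$MAPD_PATH/bin/mapd_server --data $MAPD_STORAGE/data --port 9091 --http-port 9090 --read-only"
echo "$cmd"
```

Printing first is a deliberate choice here: a mistyped --data path is easier to catch in the echoed command than after the server has started against the wrong directory.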
Command Line Configuration Flags for mapd_server¶
Flag | Description | Default | Why Change It? |
---|---|---|---|
--config arg | Path to mapd.conf | none | One use case might be to temporarily set a different configuration file during testing and troubleshooting. |
--data arg | Directory path to MapD catalogs | $PWD/data | You can set the path anywhere you choose. |
--cpu | Set this flag on a GPU installation to instruct it to use only CPUs. (You do not have to explicitly set this flag for a CPU-only installation.) | FALSE | One use case for disabling GPUs is during database conversion, which requires moving a large amount of data with minimal processing. |
--gpu | Run on GPUs and CPUs. | TRUE | Default. |
--read-only | Enable read-only mode. | FALSE | Prevents inadvertent (or nefarious) changes to the dataset. |
-p [ --port ] arg | Port number. | 9091 | Change the port number if it collides with another service on the host. Ideally, your host only runs MapD services. |
--ldap-uri arg | LDAP server URI. | N/A | N/A |
--ldap-ou-dc arg | LDAP Organizational Unit and Domain Component. | ou=users,dc=mapd,dc=com | N/A |
--http-port arg | HTTP port number. | 9090 | Change the port number if it collides with another service on the host. Ideally, your host only runs MapD services. |
--flush-log | Force aggressive log file flushes. | FALSE | When you set this flag, the system writes messages to disk as they are generated, rather than holding them until a particular threshold is reached. |
--num-gpus arg | Number of GPUs to use. | -1 | In a shared environment, you can assign a fixed number of GPUs to a particular application. The default is -1, which means use all available GPUs. |
--start-gpu arg | First GPU to use. | 0 | In a shared environment, if you want to reserve a set number of GPUs for a particular process, you can configure another process to use GPUs starting at a higher device ID. |
-v [ --version ] | Print release version number. | N/A | N/A |
--cluster arg | Indicates that the MapD Core Database instance is an aggregator node, and where to find the rest of its cluster. | $MAPD_STORAGE | This setting is not likely to change in a production environment. |
--string-servers arg | Path to the string servers list JSON file. | $MAPD_STORAGE | This setting is not likely to change in a production environment. |
Command Line Configuration Flags for mapd_web_server¶
Flag | Description | Default | Why Change It? |
---|---|---|---|
-b, --backend-url string | URL to http-port on mapd_server | http://localhost:9090 | Change to avoid collisions with other services. |
--cert string | Certificate file for HTTPS | cert.pem | Change for testing and debugging. |
-c, --config string | Path to MapD configuration file | | Change for testing and debugging. |
-d, --data string | Path to MapD data directory | data | Change for testing and debugging. |
--docs string | Path to documentation directory | docs | |
--enable-https | Enable HTTPS support | | Change to enable secure HTTP. |
-f, --frontend string | Path to frontend directory | frontend | |
--key string | Key file for HTTPS | key.pem | Change for testing and debugging. |
-p, --port int | Frontend server port | 9092 | Change to avoid collisions with other services. |
-r, --read-only | Enable read-only mode | | Prevent inadvertent (or nefarious) changes to the data. |
--servers-json string | Path to servers.json | | Change for testing and debugging. |
--timeout duration | Maximum request duration in #h#m#s format. For example, 0h30m0s represents a duration of 30 minutes. | 1h0m0s | The --timeout option controls the maximum duration of individual HTTP requests. This is used to manage resource exhaustion caused by improperly closed connections. One side effect of this option is that it limits the execution time of queries made over the Thrift HTTP transport. Increase this timeout if you expect queries to take longer than the default duration of one hour: for example, if you perform a COPY FROM on a large file when using mapdql with the HTTP transport. |
--tmpdir string | Path for temporary file storage | /tmp | The temporary directory is used as a staging location for file uploads. It is sometimes desirable to place this directory on the same file system as the MapD Core data directory. If not specified on the command line, mapd_web_server also respects the standard TMPDIR environment variable as well as a specific MAPD_TMPDIR environment variable, the latter of which takes precedence. If you use neither the command line argument nor one of the environment variables, the default, /tmp, is used. |
-v, --verbose | Print all log messages to stdout | | Change for testing and debugging. |
--version | Return version | | |
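When choosing a --timeout value, it can help to see the #h#m#s format as a plain number of seconds. The sketch below is a hypothetical helper, not part of MapD. It assumes that all three fields are present and written without leading zeros, as in 1h0m0s.

```shell
#!/bin/sh
# Hypothetical helper: convert a full #h#m#s duration (for example 1h0m0s)
# to seconds. Assumes all three fields are present, with no leading zeros.
duration_to_seconds() {
    d=$1
    h=${d%%h*}        # text before the first "h"
    rest=${d#*h}      # text after the first "h"
    m=${rest%%m*}     # text before the first "m"
    rest=${rest#*m}   # text after the first "m"
    s=${rest%s}       # drop the trailing "s"
    echo $((h * 3600 + m * 60 + s))
}

duration_to_seconds 1h0m0s    # default timeout: prints 3600
duration_to_seconds 0h30m0s   # prints 1800
```

For example, a COPY FROM that is expected to run for about 90 minutes (5400 seconds) would not fit within the default, whereas 2h0m0s (7200 seconds) would leave some headroom.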