Client Interfaces

Apache Thrift

MapD Core uses Apache Thrift to generate client-side interfaces. The interface definitions are in $MAPD_PATH/mapd.thrift. See the Apache Thrift documentation to learn how to generate client-side interfaces for different programming languages. Sample client code is in $MAPD_PATH/samples.
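As a rough illustration of the generation step, the sketch below builds the Thrift compiler command line for a given target language. It assumes the thrift compiler is on your PATH and that MAPD_PATH is set in the environment; the helper name thrift_gen_command and the fallback path are hypothetical. The script only prints the commands and does not run them.

```python
import os
import shlex


def thrift_gen_command(language, idl_path, out_dir=None):
    """Build the argument list for the Thrift compiler (hypothetical helper).

    language is a Thrift generator name such as "py" or "java".
    """
    cmd = ["thrift", "--gen", language]
    if out_dir is not None:
        # -out writes the generated files directly into out_dir
        cmd += ["-out", out_dir]
    cmd.append(idl_path)
    return cmd


# Assumption: MAPD_PATH is set; the fallback below is only a placeholder.
mapd_path = os.environ.get("MAPD_PATH", "/path/to/mapd")
idl = os.path.join(mapd_path, "mapd.thrift")

for lang in ("py", "java"):
    # Print a shell-safe command line instead of executing it
    print(" ".join(shlex.quote(arg) for arg in thrift_gen_command(lang, idl)))
```

To run a printed command from Python, pass the same list to subprocess.run(cmd, check=True).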

Python (via JayDeBeApi)

MapD Core supports Python via JayDeBeApi. The mapd_jdbc.py module in the sample code directory is a wrapper around jaydebeapi that returns a standard Python Connection object. It assumes that the MapD JDBC driver (mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar) is in the same directory. Create a cursor object from the returned connection, and be sure to close the connection at the end of your Python script.

Before using, ensure that jaydebeapi is installed by running:

pip install jaydebeapi

The jar is available at $MAPD_PATH/bin/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar.

Specify the host as <machine>:<port>. The standard port is 9091.
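The host format can be illustrated with a small sketch that splits a <machine>:<port> string, falls back to port 9091 when the port is omitted, and composes a JDBC URL in the jdbc:mapd:<machine>:<port>:<dbname> form used by the RJDBC examples in this document. The helper names split_host and jdbc_url are hypothetical and not part of mapd_jdbc. IPv6 literals are not handled.

```python
DEFAULT_PORT = 9091  # standard MapD Core port named in this document


def split_host(host, default_port=DEFAULT_PORT):
    """Split '<machine>:<port>' into (machine, port). Hypothetical helper."""
    machine, _, port = host.partition(":")
    if not machine:
        raise ValueError("host must start with a machine name: %r" % host)
    # Use the default when no port (or an empty port) is given
    return machine, int(port) if port else default_port


def jdbc_url(host, dbname, default_port=DEFAULT_PORT):
    """Compose a URL of the form jdbc:mapd:<machine>:<port>:<dbname>."""
    machine, port = split_host(host, default_port)
    return "jdbc:mapd:{}:{}:{}".format(machine, port, dbname)


print(jdbc_url("localhost:9091", "mapd"))
print(jdbc_url("colossus.mapd.com", "mapd"))
```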

The following example uses the mapd_jdbc wrapper to query MapD Core and plot the results with pyplot. The code is available in $MAPD_PATH/SampleCode as mapd_jdbc_example.py:

#!/usr/bin/env python
# Note: The following example should be run in the same directory as mapd_jdbc.py
# and mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar

import mapd_jdbc
import pandas
import matplotlib.pyplot as plt

dbname = 'mapd'
user = 'mapd'
host = 'localhost:9091'
password = 'HyperInteractive'

# Connect to the db

mapd_con = mapd_jdbc.connect(dbname=dbname, user=user, host=host, password=password)

# Get a db cursor

mapd_cursor = mapd_con.cursor()

# Query the db

query = "select carrier_name, avg(depdelay) as x, avg(arrdelay) as y from flights_2008 group by carrier_name"

mapd_cursor.execute(query)

# Get the results

results = mapd_cursor.fetchall()

# Make the results a pandas DataFrame

df = pandas.DataFrame(results)

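# Close the connection now that the results are in memory

mapd_con.close()

# Make a scatterplot of the results
# (columns are positional: 0 = carrier_name, 1 = x, 2 = y)

plt.scatter(df[1], df[2])

plt.show()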
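The example above leaves the DataFrame columns unnamed and relies on positional indexes. A minimal sketch of one alternative follows: it reads the column names from the standard DB-API cursor.description attribute and uses contextlib.closing so that the cursor and connection are closed even if the query fails. An in-memory sqlite3 database with made-up rows stands in for the MapD connection here, and fetch_labeled is a hypothetical helper. With a real connection you would pass the object returned by mapd_jdbc.connect instead.

```python
import sqlite3
from contextlib import closing


def fetch_labeled(connection, query):
    """Run a query and return (column_names, rows) using only DB-API calls."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(query)
        # The first field of each description entry is the column name
        names = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    return names, rows


# Assumption: sqlite3 is only a stand-in so that the sketch runs anywhere;
# the table and its rows are invented for illustration.
with closing(sqlite3.connect(":memory:")) as con:
    con.execute(
        "CREATE TABLE flights (carrier_name TEXT, depdelay REAL, arrdelay REAL)"
    )
    con.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [("A", 10.0, 12.0), ("A", 20.0, 8.0), ("B", 5.0, 7.0)],
    )
    names, rows = fetch_labeled(
        con,
        "SELECT carrier_name, AVG(depdelay) AS x, AVG(arrdelay) AS y "
        "FROM flights GROUP BY carrier_name ORDER BY carrier_name",
    )

print(names)
print(rows)
```

The names list can then be passed to pandas as pandas.DataFrame(rows, columns=names), so the plot can refer to df['x'] and df['y'] rather than df[1] and df[2].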

RJDBC

MapD Core supports R via RJDBC.

Simple example on localhost

library(RJDBC)
drv <- JDBC("com.mapd.jdbc.MapDDriver",
            "/home/mapd/bin/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar",
            identifier.quote="'")
conn <- dbConnect(drv, "jdbc:mapd:localhost:9091:mapd", "mapd", "HyperInteractive")
dbGetQuery(conn, "SELECT i1 FROM test1 LIMIT 11")
dbGetQuery(conn, "SELECT dep_timestamp FROM flights_2008_10k LIMIT 11")
dbDisconnect(conn)

More complex example against a remote machine

library(RJDBC)
drv <- JDBC("com.mapd.jdbc.MapDDriver",
            "/home/mapd/bin/mapdjdbc-1.0-SNAPSHOT-jar-with-dependencies.jar",
            identifier.quote="'")
conn <- dbConnect(drv,
                  "jdbc:mapd:colossus.mapd.com:9091:mapd",
                  "mapd",
                  "HyperInteractive")
dbGetQuery(conn,
  paste("SELECT date_trunc(month, taxi_weather_tracts_factual.pickup_datetime)",
        "  as key0,",
        "AVG(CASE WHEN 'Hyatt' = ANY",
        "    taxi_weather_tracts_factual.dropoff_store_chains THEN 1 ELSE 0 END)",
        "  AS series_1",
        "FROM taxi_weather_tracts_factual",
        "WHERE (taxi_weather_tracts_factual.dropoff_merc_x >= -8254165.98668337",
        "  AND taxi_weather_tracts_factual.dropoff_merc_x < -8218688.304677745)",
        "AND (taxi_weather_tracts_factual.dropoff_merc_y >= 4966267.65475399",
        "  AND taxi_weather_tracts_factual.dropoff_merc_y < 4989291.122013792)",
        "AND (",
        "  taxi_weather_tracts_factual.pickup_datetime",
        "    >= TIMESTAMP(0) '2009-12-20 08:13:47'",
        "  AND taxi_weather_tracts_factual.pickup_datetime",
        "    < TIMESTAMP(0) '2015-12-31 23:59:59')",
        "GROUP BY key0 ORDER BY key0", sep=" "))
dbDisconnect(conn)