pymapd
The pymapd client interface provides a Python DB API 2.0-compliant interface to MapD. It also provides methods that return results in the Apache Arrow-based GDF format for efficient data interchange.
Documentation
See the GitHub pymapd repository for full documentation.
Examples
Install Miniconda
Follow these instructions to prepare, install and configure Miniconda.
- Preparation
  - Optionally, pip uninstall any versions of pymapd and pygdf.
  - Optionally, delete your ~/miniconda directory.
- Install and configure Miniconda.
wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh
bash Miniconda2-latest-Linux-x86_64.sh
source ~/.bash_profile
conda create -n myenv3 python=3
source activate myenv3
conda install -c conda-forge -c gpuopenanalytics/label/dev pygdf
conda install -c conda-forge -c gpuopenanalytics/label/dev pymapd
conda install cudatoolkit
Create a Cursor and Execute a Query
Create a connection.
>>> from pymapd import connect
>>> con = connect(user="mapd", password="HyperInteractive", host="my.host.com", dbname="mapd")
>>> con
Connection(mapd://mapd:***@my.host.com:9091/mapd?protocol=binary)
Create a cursor.
>>> c = con.cursor()
>>> c
<pymapd.cursor.Cursor object at 0x7f0117fe2490>
Query database table of flight departure and arrival delay times.
>>> c.execute("SELECT depdelay, arrdelay FROM flights LIMIT 100")
<pymapd.cursor.Cursor object at 0x7f0117fe2490>
Display number of rows returned.
>>> c.rowcount
100
Display the Description objects list.
Each entry in the list is a named tuple with the attributes required by the DB API 2.0 specification. There is one entry per returned column, and pymapd fills in the name, type_code, and null_ok attributes.
>>> c.description
[Description(name=u'depdelay', type_code=0, display_size=None, internal_size=None, precision=None, scale=None, null_ok=True),
 Description(name=u'arrdelay', type_code=0, display_size=None, internal_size=None, precision=None, scale=None, null_ok=True)]
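The entries can be read by attribute name, which is handy for collecting column metadata. The following is a minimal sketch that runs without a MapD server: it builds a stand-in Description named tuple using the field names from the output above. The stand-in type and the helpers column_names and nullable_columns are illustrative assumptions, not part of pymapd.

```python
from collections import namedtuple

# Stand-in for the entries of c.description (assumption: same field names
# as in the output shown above; this is not the pymapd class itself).
Description = namedtuple(
    "Description",
    ["name", "type_code", "display_size", "internal_size",
     "precision", "scale", "null_ok"],
)


def column_names(description):
    """Return the column names from a cursor.description-style list."""
    return [d.name for d in description]


def nullable_columns(description):
    """Return the names of columns whose null_ok attribute is true."""
    return [d.name for d in description if d.null_ok]


description = [
    Description("depdelay", 0, None, None, None, None, True),
    Description("arrdelay", 0, None, None, None, None, True),
]

print(column_names(description))
print(nullable_columns(description))
```

With a live connection, the same helpers could presumably be passed c.description directly.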
Iterate over the cursor, returning a list of tuples of values.
>>> result = list(c)
>>> result[:5]
[(1, 14), (2, 4), (5, 22), (-1, 8), (-1, -2)]
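Because the cursor follows DB API 2.0, the execute-then-iterate pattern can be tried without a MapD server. The sketch below uses Python's built-in sqlite3 module as a stand-in for pymapd; the in-memory flights table and its three rows (taken from the first rows of the output above) exist only for illustration.

```python
import sqlite3

# sqlite3 is a stand-in here (assumption): it exposes the same DB API 2.0
# cursor pattern, so no MapD server is needed to run this sketch.
con = sqlite3.connect(":memory:")
c = con.cursor()

# Illustrative table and rows, not real flight data.
c.execute("CREATE TABLE flights (depdelay INTEGER, arrdelay INTEGER)")
c.executemany("INSERT INTO flights VALUES (?, ?)", [(1, 14), (2, 4), (5, 22)])

c.execute("SELECT depdelay, arrdelay FROM flights ORDER BY rowid LIMIT 100")

# In DB API 2.0, the first element of each description entry is the column name.
names = [d[0] for d in c.description]

# Iterating over the cursor yields one tuple per row.
result = list(c)
con.close()

print(names)
print(result)
```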
Select Data into a GpuDataFrame Provided by pygdf
This example shows a pygdf query on a local MapD instance.
(myenv3) me@there:~$ python
Python 3.6.2 |Anaconda, Inc.| (default, Oct 5 2017, 07:59:26)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Connect to the MapD server.
>>> from pymapd import connect
>>> con = connect(user="mapd", password="HyperInteractive", host="localhost", dbname="mapd")
Execute a SQL query.
>>> con.execute("SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100")
Use the cursor object to examine the results.
>>> c = con.cursor()
>>> c.execute("SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100")
<pymapd.cursor.Cursor object at 0x7f2453ea77b8>
>>> c.rowcount
100
>>> import pygdf as pg
>>> df = con.select_ipc_gpu("SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100")
>>> df.head()
   depdelay  arrdelay
0        -2        -7
1         0       -16
2        21         2
3        27        19
4         0        -3
Note: The select_ipc_ calls used with pygdf share the same chunk of memory and results between MapD and Python on the same server. If you need to move results over the network, use execute() calls instead.
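One way to apply this rule in client code is to choose the call based on whether the client runs on the same server as MapD. The sketch below is an assumption-laden illustration: run_query and its same_host flag are hypothetical names, and the only connection methods it relies on are select_ipc_gpu() and execute(), as used in the examples above.

```python
def run_query(con, query, same_host):
    """Hypothetical helper that picks the call suggested by the note.

    con is expected to offer select_ipc_gpu() and execute(), as the
    connection in the examples above does.
    """
    if same_host:
        # Shared-memory path: only valid when MapD and Python share a server.
        return con.select_ipc_gpu(query)
    # Network path: execute() returns a cursor; collect its rows as tuples.
    return list(con.execute(query))
```

A call might look like run_query(con, "SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100", same_host=False).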