Saving raw Numpy arrays in a database via SQLAlchemy

Posted on fre 04 oktober 2019 in Programming

I am currently implementing a database where I need to store large chunks of raw data in spreadsheet format. After some initial - slow - attempts with Pandas' to_sql method I looked around for a way to store the raw array in a Blob. I found out about the TypeDecorator class and inspired by the example at https://docs.sqlalchemy.org/en/13/core/custom_types.html#marshal-json-strings I created the following helper class.

import pickle

from sqlalchemy import Binary, TypeDecorator

class DBNumpyArray(TypeDecorator):
    """Represents a numpy array as a db BLOB

    Usage::

        DBNumpyArray(arr)

    """

    impl = Binary

    def process_bind_param(self, value, dialect):
        if value is not None:
            value = value.dumps()
        return value

    def process_result_value(self, value, dialect):
        if value is not None:
            value = pickle.loads(value)
        return value

A simple snippet that made wonders to my application!

To further improve on the applicability, portability and speed, one could imagine to use BytesIO and numpy.save instead of dumps and loads, but for me this was good enough!