SQLAlchemy with Dynamic Table Structures

I recently ran into a situation where I needed SQLAlchemy to connect to an existing database (well, actually, several hundred of them) with overlapping table names, where each table had a user-changeable schema. Some tables had 20 columns; some had a few hundred. Defining the schemas in models wasn't going to work.

SQLAlchemy supports schema discovery, but most tutorials end up with read-only functionality because the Row objects are simply tuples. After much digging, it turns out SQLAlchemy can autodiscover the table structure while still supporting CRUD operations. Here's what the example code looks like:

import os

from sqlalchemy import create_engine, Table, Column, Integer
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.automap import automap_base

class DBConnector:
    """Instantiates a connection to a supplied database and supports CRUD operations."""

    def __init__(self, database_name):
        """Establishes the connection parameters.

        :param database_name: name of target database
        :type database_name: str
        """
        self.mapped_tables = ["LIST OF TARGET TABLE NAMES"]
        self.Base = automap_base()
        self.db = create_engine(
            'mssql+pyodbc://%s:%s@%s/%s?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes' % (
                os.getenv('DB_USER'), os.getenv('DB_PASSWORD'), os.getenv('SQL_HOST'), database_name))
        self.connection = self.db.connect()

        for table_name in self.mapped_tables:
            # Reflect the table, explicitly overriding the primary key column so
            # automap can map it. (autoload=True was removed in SQLAlchemy 1.4;
            # autoload_with alone triggers reflection.)
            Table(table_name, self.Base.metadata,
                  Column('ID', Integer, primary_key=True),
                  autoload_with=self.db)

        # Generate the mapped classes from the reflected tables; Base.classes
        # is only populated after prepare() runs.
        self.Base.prepare()
        for table_name in self.mapped_tables:
            setattr(self, table_name, self.Base.classes[table_name])

        Session = sessionmaker(bind=self.db)
        self.session = Session()

So, a few important things on the above:

  • You need to know the tables you want to map ahead of time. This could be supplied as a variable. In my use case, the list was the same across all several hundred databases.
  • The most important thing is that you need to specify the primary key column, otherwise automap_base will not generate the mapped classes. This isn't necessary if the primary key is already defined in the database, but for some reason the DBs I was working with didn't have one.
  • The above is not thread-safe, so use separate processes or leverage a connection pool.
  • The above should probably have a __repr__ defined that helps identify the database it’s attached to.
  • To prevent namespace collisions, it may be beneficial to prefix the mapped-table attributes rather than setting the raw table names directly on the instance.
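The last two notes can be sketched in isolation. This is illustrative only — `DBConnectorSketch`, the `tbl_` prefix, and the `object()` stand-ins are my own naming, not part of the class above:

```python
# Sketch of a __repr__ that identifies the attached database, plus a tbl_
# prefix on mapped-table attributes so a table named e.g. "session" cannot
# shadow self.session. object() stands in for the real automapped class.
class DBConnectorSketch:
    def __init__(self, database_name, table_names):
        self.database_name = database_name
        self.session = "the real Session would live here"
        for name in table_names:
            # Prefixed attribute: a table literally named "session" becomes
            # tbl_session instead of clobbering self.session
            setattr(self, f"tbl_{name}", object())

    def __repr__(self):
        return f"DBConnectorSketch(database={self.database_name!r})"


db = DBConnectorSketch("test_db", ["session", "users"])
print(repr(db))    # → DBConnectorSketch(database='test_db')
print(db.session)  # untouched: the tbl_ prefix avoided the collision
```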


Usage then looks like:

TestDB = DBConnector('test_db')
result = TestDB.session.query(TestDB.TABLENAME).all()
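The reflect-and-prepare pattern can also be exercised end to end without an MSSQL server. Here's a minimal sketch against an in-memory SQLite database — the `widgets` table, its columns, and the sample rows are all illustrative:

```python
# Same pattern as DBConnector, demonstrated against in-memory SQLite:
# reflect a table (overriding the primary key), prepare(), then full CRUD.
from sqlalchemy import create_engine, text, Table, Column, Integer
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")
with engine.begin() as conn:
    # A table whose schema we pretend not to know ahead of time
    conn.execute(text("CREATE TABLE widgets (ID INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO widgets (name) VALUES ('a'), ('b')"))

Base = automap_base()
# Reflect the table; the explicit Column ensures automap sees a primary key
Table("widgets", Base.metadata, Column("ID", Integer, primary_key=True),
      autoload_with=engine)
Base.prepare()  # without this, Base.classes stays empty
Widgets = Base.classes["widgets"]

session = sessionmaker(bind=engine)()
session.add(Widgets(name="c"))                       # create
row = session.query(Widgets).filter_by(name="a").one()  # read
row.name = "z"                                       # update
session.commit()
print(sorted(w.name for w in session.query(Widgets)))  # → ['b', 'c', 'z']
```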

Backing up a Volume from Amazon EC2 to Amazon S3 (Also, exporting an image for download)

If the traditional methods of exporting a volume aren't working for you, you can still get it out using the following instructions:

  1. Start up a micro instance using a Linux flavor of your choice.
  2. On your target instance, shut it down and detach the volume.
  3. Attach it to your new micro instance as /dev/sdf (depending on the instance, it may show up inside the OS as /dev/xvdf).
  4. Issue the following command:
    1. # dd if=/dev/sdf | gzip -1 - | pv | aws s3 cp - s3://bucketname/tgtname.gz
    2. Explanation of the above:
      1. dd reads the file bit for bit, we set the input to our block device
      2. We pipe it to gz to compress the byte stream
      3. pv allows us to monitor our progress (you’ll need to do yum install -y pv)
      4. "aws s3 cp -" is the amazon awscli s3 cp command with stdin (-) as the source.

This will result in a compressed raw image in S3 that you can then download. Keep in mind that this may violate the ToS, so read that prior to going forward with the above.
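Without a spare block device handy, the same pipe structure can be sanity-checked on an ordinary file. This is a local sketch only — the file names are illustrative, and a local gzip file stands in for the `aws s3 cp -` upload stage:

```shell
# Sanity-check the dd | gzip pipeline on a regular file instead of a volume
workdir=$(mktemp -d)
dd if=/dev/urandom of="$workdir/disk.img" bs=1024 count=64 2>/dev/null

# Same shape as the EC2 command, with a local file in place of the S3 upload
dd if="$workdir/disk.img" 2>/dev/null | gzip -1 - > "$workdir/disk.img.gz"

# Restore and verify the stream survived compression bit for bit
gunzip -c "$workdir/disk.img.gz" > "$workdir/restored.img"
if cmp -s "$workdir/disk.img" "$workdir/restored.img"; then
    result="round trip OK"
else
    result="round trip FAILED"
fi
echo "$result"
rm -rf "$workdir"
```

Restoring the real image is the same pipeline reversed: stream the object from S3, decompress, and `dd` it onto a target device.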