SQLAlchemy with Dynamic Table Structures

I recently ran into a situation where I needed SQLAlchemy to connect to an existing database (well, actually, several hundred of them) with overlapping table names, where each table had a user-changeable schema. Some tables had 20 columns and some had a few hundred. Defining the schemas in models wasn’t going to work.

SQLAlchemy supports schema discovery, but most tutorials end up with read-only functionality because the returned Row objects are simply tuples. After much digging, it turns out SQLAlchemy can autodiscover the table structure while still supporting CRUD operations. This is what the example code looks like:

import os

from sqlalchemy import MetaData, create_engine, Table, Column, Integer
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.automap import automap_base

class DBConnector:
    """
    Instantiates a connection to a supplied database and supports CRUD operations
    """
    def __init__(self, database_name):
        """
        Establishes the connection parameters

        :param database_name: name of target database
        :type database_name: str
        """
        self.mapped_tables = ["LIST OF TARGET TABLE NAMES"]
        self.Base = automap_base()
        self.db = create_engine(
            'mssql+pyodbc://%s:%s@%s/%s?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes' % (
                os.getenv('DB_USER'), os.getenv('DB_PASSWORD'), os.getenv('SQL_HOST'), database_name))
        self.connection = self.db.connect()

        for table_name in self.mapped_tables:
            # Overriding the ID column as the primary key lets automap
            # generate a class even when the table doesn't declare one
            Table(table_name, self.Base.metadata, Column('ID', Integer, primary_key=True),
                  autoload=True, autoload_with=self.db)
            self.Base.prepare()
            setattr(self, table_name, self.Base.classes[table_name])

        Session = sessionmaker(bind=self.db)
        self.session = Session()

So, a few important things about the above:

  • You need to know the tables you want to map ahead of time. This could be supplied as a variable. In my use case, the list was the same across all couple hundred databases.
  • Most importantly, you need to specify the primary key column, otherwise automap_base will not instantiate the attributes. This isn’t necessary if the primary key is already defined in the table, but for some reason it wasn’t in the DBs I was working with.
  • The above is not thread-safe, so use separate processes or leverage a pool.
  • The above should probably have a __repr__ defined that helps identify the database it’s attached to.
  • To prevent namespace collisions, it may be beneficial to prefix the mapped attribute names (e.g. tbl_TABLENAME) rather than setting the raw table name directly on the instance.
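On the thread-safety point, here is a minimal sketch of the one-connector-per-process approach using multiprocessing. The row_count function and SQLite URLs are stand-ins for illustration; in the real setup you would substitute the mssql+pyodbc URL that DBConnector builds.

```python
from multiprocessing import Pool

import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker

def row_count(db_url):
    # Build a fresh engine and session inside the worker process; engines
    # and sessions are never shared across processes.
    engine = sa.create_engine(db_url)
    session = sessionmaker(bind=engine)()
    try:
        session.execute(sa.text("CREATE TABLE IF NOT EXISTS t (ID INTEGER PRIMARY KEY)"))
        return session.execute(sa.text("SELECT COUNT(*) FROM t")).scalar()
    finally:
        session.close()
        engine.dispose()

if __name__ == "__main__":
    # One worker per database; each builds its own connection stack
    with Pool(processes=2) as pool:
        print(pool.map(row_count, ["sqlite://", "sqlite://"]))
```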

Usage:

TestDB = DBConnector('test_db')
result = TestDB.session.query(TestDB.TABLENAME).all()
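To show the CRUD side end to end, here’s a self-contained sketch of the same trick against an in-memory SQLite database (the widgets table and its columns are made up for illustration):

```python
import sqlalchemy as sa
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import sessionmaker

engine = sa.create_engine("sqlite://")
with engine.begin() as conn:
    # A table with no declared primary key, like the databases described above
    conn.execute(sa.text("CREATE TABLE widgets (ID INTEGER, name TEXT)"))

Base = automap_base()
# Overriding the ID column as a primary key lets automap generate a class
# even though the table itself doesn't declare one
sa.Table("widgets", Base.metadata,
         sa.Column("ID", sa.Integer, primary_key=True),
         autoload_with=engine)
Base.prepare()
Widgets = Base.classes.widgets

session = sessionmaker(bind=engine)()

# Create / read / update / delete through the mapped class
session.add(Widgets(ID=1, name="sprocket"))
session.commit()
row = session.query(Widgets).filter_by(ID=1).one()
row.name = "gear"
session.commit()
session.delete(row)
session.commit()
```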

Backing up a Volume from Amazon EC2 to Amazon S3 (Also, exporting an image for download)

If you’re having trouble with the traditional methods of exporting a volume, you can still get it out using the following instructions:

  1. Start up a micro instance using a Linux flavor of your choice.
  2. On your target instance, shut it down and detach the volume.
  3. Attach it to your new micro instance as /dev/sdf
  4. Issue the following command:
    1. # dd if=/dev/sdf | gzip -1 - | pv | aws s3 cp - s3://bucketname/tgtname.gz
    2. Explanation of the above:
      1. dd reads the device bit for bit; we set the input to our block device
      2. We pipe it to gzip to compress the byte stream
      3. pv allows us to monitor our progress (you’ll need to do yum install -y pv)
      4. "aws s3 cp -" is the Amazon awscli s3 cp command with stdin (-) as the source.

This will result in an exported (gzipped) raw image that you can then download. Keep in mind that this may violate the ToS, so read that prior to going forward with the above.
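The plumbing can be sanity-checked locally against an ordinary file before pointing it at a real block device; the /tmp paths below are placeholders, with pv dropped since it is only a progress meter:

```shell
# Make a small test "image" to stand in for /dev/sdf
dd if=/dev/urandom of=/tmp/src.img bs=1024 count=1024

# Forward direction: read bit for bit, compress; writing to a local file
# here, where the real run pipes to `aws s3 cp - s3://bucketname/tgtname.gz`
dd if=/tmp/src.img | gzip -1 - > /tmp/tgtname.gz

# Restore direction: decompress and write the image back out; against S3
# this would start with `aws s3 cp s3://bucketname/tgtname.gz -`
gunzip -c /tmp/tgtname.gz | dd of=/tmp/restored.img

# The round trip should be byte-identical
cmp /tmp/src.img /tmp/restored.img && echo "round trip OK"
```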

Wkhtmltopdf PRE Formatting Issue

On Ubuntu 14.04 LTS, wkhtmltopdf was not outputting <pre>-formatted sections of HTML the same way the browser would. The easiest fix is to install ttf-mscorefonts-installer:

sudo apt-get install ttf-mscorefonts-installer

Retry and it should work fine.

Logstash-Forwarder Logs Flooded with “Socket error, will reconnect…broken pipe”

I was getting a flood of errors on some of my logstash-forwarding machines today:

Feb 23 17:08:43 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:43.184108 Registrar received 5 events
Feb 23 17:08:49 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:49.625239 Socket error, will reconnect: write tcp ipaddress:5000: broken pipe
Feb 23 17:08:50 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:50.625656 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
Feb 23 17:08:50 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:50.627188 Connecting to ipaddress:5000 (ipaddress)
Feb 23 17:08:50 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:50.682474 Connected to ipaddress
Feb 23 17:08:50 lg1 logstash-forwarder[22493]: 2015/02/23 17:08:50.683530 Registrar received 5 events

This repeated over and over, every second. It turns out that one old certificate on one shipping machine can reset all of the other machines’ connections. I found the old cert on one of the servers, added the new one, restarted the logstash-forwarder service, and all the messages went away.
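To track down which shipper has the stale cert, comparing expiry dates and fingerprints across machines helps; certinfo here is a hypothetical helper wrapping openssl:

```shell
# certinfo: print a certificate's expiry and fingerprint on one line, so a
# stale copy stands out when run on each shipping machine
certinfo() {
    openssl x509 -in "$1" -noout -enddate -fingerprint | tr '\n' ' '
    echo
}

# On each shipper, point it at the cert path from the log output, e.g.:
#   certinfo /etc/pki/tls/certs/logstash-forwarder.crt
```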

Source: https://github.com/elasticsearch/logstash-forwarder/issues/160

LibreOffice Base: Connecting to MSSQL in Linux

First, download the SQL JDBC Driver from Microsoft.

  1. Extract the JDBC drivers somewhere. There’s a series of folders and a Jar file that we’ll be getting.
  2. Once you’ve downloaded that, open LibreOffice Calc or Writer.
  3. Go to Tools -> Options -> Advanced.
  4. Click Class Path on the right and choose Add Archive.
  5. Navigate to your file and choose sqljdbc4.jar
  6. Restart LO and you should be ready to connect.
  7. Open LO Base
  8. Choose Connect to an Existing Database, with Type set to JDBC.
  9. In the Datasource URL:
    1. sqlserver://ipaddress:1433;databaseName=DBNAMEHERE
  10. In the JDBC Driver Class:
    1. com.microsoft.sqlserver.jdbc.SQLServerDriver
  11. Enter your credentials and test; you should be good to go!

KVM: Getting Bonding, Bridges, VLANs and Macvtap Playing Nice Together

I finally made the jump from a Physical Ethernet -> VLAN -> Bridge stack to Physical Ethernet Devices -> Bond -> VLANs -> Macvtap. Notice that bridges are not included; that’s because they don’t work with Linux bonding, so you need to use Macvtap interfaces instead.

Here’s a sample configuration (you’ll need to install ifenslave & vlan if you haven’t already):

# The loopback network interface
auto lo
iface lo inet loopback

#Intel Interface #1
auto p4p1
iface p4p1 inet manual
    bond-master bond0

#Intel Interface #2
auto p4p2
iface p4p2 inet manual
    bond-master bond0

#Onboard Intel NIC
auto eth0
iface eth0 inet manual
    bond-master bond0

#Primary Bond Interface
auto bond0
iface bond0 inet manual
    # MII link monitoring frequency in milliseconds; how often each
    # slave's link state is inspected for link failures
    bond-miimon 100
    # milliseconds to wait before disabling a slave after a link failure
    bond-downdelay 200
    # milliseconds to wait before enabling a slave after a link recovery
    bond-updelay 200
    # mode 0 is round robin (think of it as network RAID 0)
    bond-mode 0
    # the slaves are defined in the interfaces above with bond-master
    bond-slaves none

#DMZ VLAN
#bond0.11 sets a vlan tag of 11 for all traffic on this interface
auto bond0.11
iface bond0.11 inet manual
    vlan-raw-device bond0

So, the above will aggregate three NICs and round-robin packets across the three interfaces to get their combined throughput. Once you’ve set up the above, you can go into virt-manager and add a Macvtap interface paired to either the bond0 interface or one of your VLAN bonded interfaces.

I set mine to virtio with a mode of bridge, which allows guests in the same VLAN on the same host to communicate with each other.
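For reference, the equivalent stanza in the libvirt domain XML (reachable via virsh edit) looks roughly like the sketch below; pairing it to the bond0.11 VLAN device is an assumption based on the config above, and virt-manager fills in the MAC for you:

```xml
<interface type='direct'>
  <source dev='bond0.11' mode='bridge'/>
  <model type='virtio'/>
</interface>
```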

High CPU Usage in QEMU/KVM for Windows 7/8/2008 R2 Guest

It may be QXL. If you have SPICE graphics enabled (this probably affects VNC too), check whether you’re using QXL. Changing it from QXL to VMware’s VMVGA took idle CPU usage from 10%-30% down to 3%-7%.

Virt-Manager has a bug that won’t let you change it from QXL, so go to the command line and type:

virsh edit domainname

Go down to the video section and update the model type to be:

<model type='vmvga'/>

Shutdown and start the Windows guest again and you should be good to go.

Remove Failing Disk from MDADM Array

If you need to stop a resync of an MD array, the easiest way to do it is:

echo "idle" > /sys/block/md0/md/sync_action && mdadm --manage /dev/md0 --fail /dev/sda1

Revise the above based on the number of your array and the device that you want to mark as failed. Without the &&, the resync will resume right away; chaining the commands stops the sync and immediately marks the specified device as failed, which can be dealt with later.

 

KVM: Configure Mirrored Port’s Traffic to Be Visible in Guest (SNORT)

This one was tricky. If you want SNORT to run as a guest, getting the port-mirrored traffic to it isn’t exactly trivial.

My Host Setup:

NIC 1: Handles all bridges and VLANs

NIC 2: Dedicated to SNORT, plugged directly into the mirrored port on the switch.

 

Before we start, make sure your SNORT guest is off and the bridge interface that you want to use for packet sniffing is down. Getting the packets to flow properly requires a very specific sequence, and failing to follow it will only result in frustration and confusion.

First Step (on KVM Host):

Edit /etc/network/interfaces to reflect for NIC2 (eth3 in my case):

auto eth3
iface eth3 inet manual

#SNORT Interface
auto br3
iface br3 inet manual
    bridge_ports eth3
    bridge_stp off

Bring up br3

ifup br3

I haven’t tested it yet, but you may be able to ensure the host doesn’t get an address on that interface by adding the line: address 0.0.0.0

Now, the tricky part is getting the traffic to show up in the guest. On the host, just to make sure your port mirroring is working properly, run:

sudo tcpdump -i br3

You should see a ton of traffic if configured properly.

Now, to get the traffic not destined for the host to go to the guest (SNORT), type in:

brctl setageing br3 0
brctl setfd br3 0

Setting the MAC ageing time to 0 stops the bridge from learning addresses, so it floods every frame out all of its ports like a hub; setting the forward delay to 0 makes it start forwarding immediately.

Now, if you boot up your guest (it should have two interfaces), you should be able to tcpdump -i eth# and it will show the same large amount of traffic. The above is manual and will not persist across reboots.

To make it persistent in Ubuntu (replace br3 with your bridge interface name):

cd /etc/network/if-up.d

touch br3-mirror

chmod +x br3-mirror

nano br3-mirror

#!/bin/bash
if [ "$IFACE" = br3 ]; then
    brctl setageing br3 0
    brctl setfd br3 0
fi

 

Logstash: Troubleshooting High CPU Usage

If you’re seeing high CPU usage that won’t go away on your logstash server, it may be due to a misbehaving config file. My logstash was constantly respawning; tailing /var/log/syslog showed this repeating:

Jan 14 18:12:34 lg1 kernel: [ 1105.103887] init: logstash main process (7480) terminated with status 1
Jan 14 18:12:34 lg1 kernel: [ 1105.103894] init: logstash main process ended, respawning
Jan 14 18:12:40 lg1 kernel: [ 1110.928650] init: logstash main process (7508) terminated with status 1
Jan 14 18:12:40 lg1 kernel: [ 1110.928656] init: logstash main process ended, respawning
Jan 14 18:12:46 lg1 kernel: [ 1116.815068] init: logstash main process (7536) terminated with status 1
Jan 14 18:12:46 lg1 kernel: [ 1116.815079] init: logstash main process ended, respawning
Jan 14 18:12:52 lg1 kernel: [ 1122.919407] init: logstash main process (7567) terminated with status 1
Jan 14 18:12:52 lg1 kernel: [ 1122.919414] init: logstash main process ended, respawning
Jan 14 18:12:58 lg1 kernel: [ 1128.908131] init: logstash main process (7596) terminated with status 1
Jan 14 18:12:58 lg1 kernel: [ 1128.908141] init: logstash main process ended, respawning
Jan 14 18:13:04 lg1 kernel: [ 1134.837492] init: logstash main process (7625) terminated with status 1
Jan 14 18:13:04 lg1 kernel: [ 1134.837504] init: logstash main process ended, respawning
Jan 14 18:13:10 lg1 kernel: [ 1140.914141] init: logstash main process (7655) terminated with status 1
Jan 14 18:13:10 lg1 kernel: [ 1140.914148] init: logstash main process ended, respawning
Jan 14 18:13:15 lg1 kernel: [ 1145.703058] init: logstash main process (7685) terminated with status 1
Jan 14 18:13:15 lg1 kernel: [ 1145.703063] init: logstash main process ended, respawning

To figure out whether it’s a bad config file, stop the logstash service (service logstash stop) and start it manually with:

sudo -u logstash /opt/logstash/bin/logstash agent -f /etc/logstash/conf.d/ -l /var/log/logstash/logstash.log

If you tail the log, you may see something along the lines of this:

:message=>"+---------------------------------------------------------+\n| An unexpected error occurred. This is probably a bug.   |\n| You can find help with this problem in a few places:    |\n|                                                         |\n| * chat: #logstash IRC channel on freenode irc.          |\n|     IRC via the web: http://goo.gl/TI4Ro                |\n| * email: logstash-users@googlegroups.com                |\n| * bug system: https://logstash.jira.com/                |\n|                                                         |\n+---------------------------------------------------------+\nThe error reported is: \n  Couldn't find any filter plugin named 'file'. Are you sure this is correct? Trying to load the file filter plugin resulted in this error: no such file to load -- logstash/filters/file"}

The key phrase is “Couldn’t find any filter plugin named ‘file’.” Check your config files in /etc/logstash/conf.d: in this particular case, one of mine was trying to use file as a filter, but Logstash has no filter plugin named file (file exists only as an input and an output).

The quick fix is to remove or correct the offending config file, restart the logstash service, and see if the CPU usage goes back down.