Quadcap Software

Quadcap Embeddable Database

Quadcap Software

Developing With QED

The Quadcap Embeddable Database implements SQL92 and JDBC 2.0.

Connecting to a database

In order to access a database, you need to obtain a JDBC Connection object. There are two basic ways to get a database connection:

Using the JDBC DriverManager interface, you can directly obtain a JDBC Connection, if you know:

  • The name of the JDBC Driver class (com.quadcap.jdbc.JdbcDriver)
  • The database URL (jdbc:qed:database-name)
  • Any connection parameters (see below)

For QED, you could obtain a database connection using code similar to the following:

Class.forName("com.quadcap.jdbc.JdbcDriver");
java.sql.Connection conn = 
    java.sql.DriverManager.getConnection("jdbc:qed:mydb");
	

or, to create a new database:

Class.forName("com.quadcap.jdbc.JdbcDriver");
java.util.Properties p = new java.util.Properties();
p.setProperty("create", "true");
java.sql.Connection conn = 
    java.sql.DriverManager.getConnection("jdbc:qed:mynewdb", p);
	

another way to create a new database:

Class.forName("com.quadcap.jdbc.JdbcDriver");
java.sql.Connection conn = 
    java.sql.DriverManager.getConnection("jdbc:qed:mynewdb;create=true");
	

Connection properties

NameDefaultDescription
mode from Config: qed.defaultMode Read-only or read-write access to the database:
  • rw: Read-write (the default)
  • r: Read-only, intended for deployment on read-only media
isCaseSensitive from Config: qed.isCaseSensitive A boolean flag specifying whether string (VARCHAR) items are case-sensitive.
create false If "true", the database will be created if it doesn't already exist.
blockSize from Config: qed.blockSize The database block size, which must be a power of two. Maximum size: 32768, Minimum recommended size: 4096.
cacheSize from Config: qed.cacheSize The database cache size. A certain number of blocks are cached in an LRU cache; this parameter controls the size of this cache.
scratchCacheSize from Config: qed.scratchCacheSize The database scratch cache size, used for caching access to the database temporary (or "scratch") file. A certain number of blocks are cached in an LRU cache; this parameter controls the size of this cache.
useLockFile true A boolean. If true, a lock file will be used to protect from database corruption which might result from multiple processes accessing the same database simultaneously.
force false QED uses a lockfile to prevent multiple JVMs from accessing the database files simultaneously, which could lead to database corruption. When the JVM exits normally, this lockfile is deleted. If the JVM exits abnormally the lockfile may be left behind, and subsequent attempts to access the database will fail because of the old lockfile. If force is true, the database connection will succeed even if the lockfile is present. This option should be used with extreme care, since if there really is another process accessing the database, corruption may result.
logger 2 If no transaction logging is required (i.e., no rollback or recovery), then use '0'. Otherwise, use '2', (the default)
scratchDir database directory Specify the directory where scratch files are written. This option is most useful in conjunction with mode=r, where the database files can be located on read-only media, and the scratch directory can be located separately.
fastSync true If true, omit time-consuming sync operations to permit greater throughput.
minSyncInterval from config:qed.minSyncInterval Minimum interval between syncs. Set to zero to ensure a sync after every transaction.
maxSyncInterval from config:qed.maxSyncInterval Maximum interval between syncs. Use this to help limit the logfile, scratch, and before-images files.
minLogSize 128 K bytes For loggers which support rollback, the minimum size at which a logging checkpoint operation will occur. The logging checkpoint cleans the log of entries which correspond to completed transactions.
qed.deletedRowCache 1000 Maximum number of deleted rows before we go to disk.
encrypt false On database creation (i.e., datafile doesn't yet exist and 'create=true'), if encrypt == true, then a low-level block encryption algorithm (the default is Rijndael) will be selected, seeded with the values of the user and passwd properties for this, connecting, user. The database magic code will be changed from 04 0c to 0e 0c to signal the encrypted content.

Config

Overall server configuration information is also managed through the Config "service", which can be controlled using the file config.props. The following configuration variables are available in that file:

NameDefaultDescription
debug.file stderr Specify the name of the debug log file. The default value, stderr, causes debug information to be written to System.err.
debug.level 0 Specify the level of debugging output. Level zero, the lowest level, essentially turns debug output off (though errors/exceptions will still be reported.) Increasing levels will result in more debug output. The maximum debug level is five (5).
qed.blockSize 8192 The database block size.
qed.cacheSize 256 blocks The database cache size. A certain number of blocks are cached in an LRU cache; this parameter controls the size of this cache. 32 is probably a the minimum for a single-threaded application; more threads increase cache requirements.
qed.scratchCacheSize 32 blocks The "temporary" file cache size. A certain number of blocks in the scratch file (used for temporary tables and indexes) are cached in an LRU cache; this parameter controls the size of this cache.
qed.defaultMode rw The default database 'mode'. Either 'r' or 'rw'
qed.isCaseSensitive true A boolean flag specifying whether string (VARCHAR) items are case-sensitive.
qed.minSyncInterval 15000 (15 seconds) The minimum interval (in ms) between database sync operations.
qed.maxSyncInterval 60000 (60 seconds) The maximum interval (in ms) between database sync operations. Use this to help limit the logfile, scratch, and before-images files.

Debug builds of the server can generally support various levels of trace output configured by the Config system:

NameDefaultDescription
qed.trace.JdbcDriver 0
bit 0: API execution
bit 1: Database open/close
qed.trace.Connection 0
bit 0: Connection lifecycle
bit 1: Connection methods
qed.trace.Statement 0
bit 0: statement lifecycle
bit 1: statement execution
bit 2: statement + result
bit 4: params in prepared statements
qed.trace.ResultSet 0
bit 0: API execution
bit 1: findColumn
bit 2: rows
bit 3: row + column -> object
bit 5: updateObject(row, col, obj)
qed.trace.file 0
bit 0:  Block
bit 1:  Block verbose
bit 2:  BlockAccess
bit 3:  BlockAccess verbose
bit 4:  BlockFile
bit 5:  BlockFile verbose
bit 9:  Log: blobs
bit 11: SubPageManager
bit 12: Datafile
bit 13: CircularBuffer
bit 14: CircularBuffer streams
bit 15: LogSync handle POST
bit 16: Log1.reallyAddEntry()
bit 17: LogSync handle PRE
bit 18: Log restart
bit 19: Log1 save/restore blocks
bit 20: BlockStore.write
bit 21: Log1.reallyCheckpoint
bit 22: Logger1 checkpoint
bit 23: Logger1 checkpoint verbose
bit 24: sync()
bit 25: checkpoint
qed.trace.index 0
bit 0: Btree
bit 2: Bnode
bit 3: BnodeEnumeration
bit 4: Comparator
bit 5: Bnode splits
bit 6: BtreeCursor seek
bit 7: BtreeCursor insert
bit 8: BtreeCursor replace
bit 9: BtreeCursor delete
bit 10: BtreeCursor movement normal
bit 11: BtreeCursor movement exceptional
bit 12: BtreeCursor show keys only
qed.trace.locks 0
bit 0: verbose lock tracing
bit 1: lock tracing
bit 2: really verbose lock tracing
bit 3: transactions
bit 4: dump locktable every 30 seconds
qed.trace 0
bit 0:  add/remove relation, dependencies
bit 1:  Verbose exceptions
bit 2:  Verbose cursor creation
bit 3:  ~Verbose data
bit 4:  predicate evaluation
bit 5:  index constraints on queries
bit 6:  key comparison
bit 7:  row value operations
bit 8:  Connection open/close
bit 9:  end/commit/rollback trans/stmt/rs
bit 10: session: statement contexts
bit 11: table row operations
bit 12: table row stream operations
bit 13: TableOp.getRow
bit 14: TableOp.insertRow, updateRow
bit 15: LogStep.redo
bit 16: LogStep.undo
bit 17: Join temporary index creation

Security

QED is an embedded database using the local filesystem rather than the network. At this level, for most embedded applications, filesystem security is enough. This is why QED doesn't normally use the JDBC password, and doesn't use the authentication id for anything other than the initial SQL schema name.

With the release of QED 3.1, you can now encrypt the database using a key formed by concatenating the JDBC user and password. Your data is then securely protected with encryption using the new AES (Rijndael) standard. The encryption algorithm for 3.1 uses a 128 bit key. Performance testing indicates a loss of about 10% on the Fill test and a similar amount on the 'TpcB' test when encryption is enabled.

Memory Footprint

QED is architected for low memory consumption. This means that generally, we use streams instead of arrays wherever possible, and that we use disk-based structures (taking advantage of the block cache somewhat) instead of in-memory structures for things like temporary indexes and orderings.

Java doesn't make it very easy to say exactly how much memory your application uses, because the numbers you get from inside the VM don't match the numbers when you examine the external process behavior of the VM itself. That said, here's what I can report about QED's memory footprint under at least two VMs:

Methodology: I simply ran each of the java programs while watching the dynamic system monitor and looking at the VM size of the process running the VM. I looked for the peak size, which was generally pretty much the same as the size 15 seconds later (during the final sleep which I put in each testcase)

I used this class to measure the base footprint of the VM itself:

public class foo {
    public static void main(String[] args) {
        try {
            Runtime r = Runtime.getRuntime();
            System.out.println("footprint: " + 
                               (r.totalMemory() - r.freeMemory()) +
                               " / " + r.totalMemory());
            Thread.sleep(30000);
        } catch (Throwable t) {}
    }
}
      

Firing up the VM itself and a minimal set of classes has a base overhead:

  Base VM: (as measured by foo.java)
      Linux JDK 1.4.1:       external:  9288K, internal:  261K
      Windows 2000 JDK 1.4.1 external:  5732K, internal:  251K

  XmlLoad getConnection (create Database)
      Linux JDK 1.4.1        external: 14060K, internal:  677k
      Windows 2000 JDK 1.4.1 external:  8140K, internal:  585K

  XmlLoad loadDatabase complete
      Linux JDK 1.4.1        external: 15060K, internal: 1211K
      Windows 2000 JDK 1.4.1 external:  8864K, internal:  698K

  QED footprint (XmlLoad max footprint - Base VM footprint)
      Linux Sun JDK1.4.1     external:  5772K, internal:  950K
      Windows 2000 JDK 1.4.1 external:  3132K, internal:  447K
	

Performance

QED is designed to perform very well with most typical SQL operations, given a few considerations:

Indexes:

Creating the proper indexes makes all the difference on some queries. QED will use indexes that have columns which match those used in the query/update. QED will create (and delete) temporary indexes as necessary for inner join operations where the requisite index doesn't exist in the inner table.

Non-indexable queries

Some queries, like those using 'LIKE' or using functions, cross joins, etc., aren't indexable and use essentially brute-force approaches. This may be OK if your data is limited, but will fail badly with large datasets.

Joins

QED has a very simple join plan. Tables are joined left to right, with the left table being the outer, the right table being the inner table, in a series of nested inner loop INNER JOINs wherever possible. Any kind of equijoin or join on columns will use this approach. Failing a common column, we'll resort to a cross join, which is a full cartesian product. The inner table in the cross join is iterated for every row of the outer, leading to possibly very long run times.

For INNER joins, performance should be very good if the inner column is indexed already. (I.e., consider adding an index in this situation.)

Concurrency

QED fully supports concurrent access, while maintaining SERIALIZABLE isolation and ACID properties. QED's LockManager supports a hierarchical lock tree which uses multiple lock modes to permit multiple readers and a single writer to each database structure. Locking is performed at the table level. Table locking implies that sometimes programs will block, waiting for a table lock, if it's in use by other transactions in an "inconsistent" mode. Table locking is also (as with any two-phase locking approach) subject to deadlock. QED inelegantly resolves this using a configurable "lock timeout" parameter.

It's possible (and highly recommended) to avoid deadlock in the application, by ensuring that table locks are acquired in a canonical order, thus preventing the deadlock condition.

Keeping transactions short is another key under a table locking regime. A long-running transaction clogs up the log, blocks other transactions, and should generally be avoided.

In general, these limitations related to concurrency are the result of a conscious design compromise: QED's target architecture isn't designed to maximize concurrent performance. Rather, the objective is to be small and fast for typical (i.e., single user) access, while permitting concurrent access for carefully designed or small scale applications.

Tuning

The single biggest tuning knob is the 'cacheSize' connect property. Increasing the cache size can help quite a bit it some applications -- some experimentation may be necessary to find the best size for your application. Depending on your I/O system, you may benefit from a different (the default is 8K) block size; other supported block sizes are 4K and 16K.

Speed freaks

For single user applications where ACID doesn't really matter (i.e., you're bulk-loading a database from some other source, and if the load fails, you'll just re-run it), you can speed things up by getting rid of the transaction logger using the connect parameter 'logger=0'

Administration

QED requires zero administration. Still, sometimes you *want* to administer your data -- with QED, the database is simply a directory in the filesystem containing files accessed via a JDBC url using the QED JDBC driver. If you want, you can use file system backups (ideally when the datbase is closed, but even a snapshot of a running database should be recoverable using QED's restart logic.) to backup your database.

For more direct control over the database (as well as for a smaller backup file), QED includes the utility classes XmlDump, XmlLoad. These classes can be run from the command line:

# my handy java+qed alias :-)
alias qjava=java -classpath $QED/lib/qed.jar:$QED/lib/antlr.jar
    
# dump the db to a compressed xml file
qjava -Djdbc.url=jdbc:qed:mydb com.quadcap.sql.tools.XmlDump db.xml.gz

# reload the db from the file
rm -rf mydb
qjava '-Djdbc.url=jdbc:qed:mydb;create=true' \
      com.quadcap.sql.tools.XmlLoad db.xml.gz
      

You can also call XmlDump/XmlLoad from your application:

import java.io.*;
import java.sql.Connection;
import com.quadcap.sql.tools.XmlDump;

void dumpDb(Connection conn, String outfile) {
    FileOutputStream fos = new FileOutputStream(outfile);
    OutputStreamWriter ow = new OutputStreamWriter(fos);
    BufferedWriter bw = new BufferedWriter(ow);
    XmlDump dump = new XmlDump(conn);
    dump.dumpTables(bw);
    bw.close();
}
      

SQL Console

Many people have asked about a SQL consoles for QED. There are various options available here. First, a pretty nice JSP implementation of a SQL console, with support for DatabaseMetaData, is included with QWS. This web application will show you all of the QED databases currently open in this JVM, and allow you to connect the SQL console to each database. (You can also create, backup, and load from backup using the QED "admin" webapp.)

SQuirreL-SQL is a free, open-source SQL console that works very well with QED. It's my personal favorite -- easy to install, configure, and use.

iSQL-Viewer is another free, open-source SQL console with many powerful features. It has a steeper learning curve than SQuirrel-SQL, but it may be worth it if you need the extra features.

DbVisualizer is a free (though not open source) tool that works well with QED.

Fragmentation

QED reclaims space used by deleted objects, but this space is "locked up" inside the database's data file until new objects are created. It would be possible to move the remaining objects around to allow the file to be truncated, but this is a somewhat scary operation, and isn't likely to be implemented soon, if ever.

However, the same effect can be achieved pretty easily by using XmlDump and XmlLoad to dump the database and then reload it without the holes.

QED database migration

Generally, database formats are pretty closely tied to the particular QED version that generated them. So, you can't access a QED 3.0 database using QED 3.1. You can use the XmlDump + XmlLoad trick to convert the old database to the new format.

The tricky part of this is that there are bugs in pre-3.0 versions of XmlDump which prevent this from working in the expected way.

The solution is to use the new "multidriver" class, com.quadcap.jdbc.MultiDriver. This is a special JDBC driver which uses its own classloader to load old QED drivers. The JDBC URL used by this driver is just like a regular QED URL, with the subprotocol of mqed, and the additional connection property qed, which specifies the path to the old QED jar file.

For example, if the original database URL were:

jdbc:qed:mydb;user=joe

The MultiDriver url to access this database would be something like:

jdbc:mqed:mydb;user=joe;qed=/home/qed-2.2/lib/qed.jar

This is admittedly a little funky, but it's a simple solution to an infrequent yet important problem.