|
Quadcap Embeddable Database
Quadcap Software
|
Developing With QED
The Quadcap Embeddable Database implements SQL92 and JDBC 2.0.
Connecting to a database
In order to access a database, you need to obtain a JDBC
Connection object. There are two basic ways to
get a database connection:
Using the JDBC DriverManager interface, you can directly
obtain a JDBC Connection, if you know:
- The name of the JDBC Driver class (
com.quadcap.jdbc.JdbcDriver)
- The database URL (
jdbc:qed:database-name)
- Any connection parameters (see below)
For QED, you could obtain a database connection using code similar
to the following:
Class.forName("com.quadcap.jdbc.JdbcDriver");
java.sql.Connection conn =
java.sql.DriverManager.getConnection("jdbc:qed:mydb");
or, to create a new database:
Class.forName("com.quadcap.jdbc.JdbcDriver");
java.util.Properties p = new java.util.Properties();
p.setProperty("create", "true");
java.sql.Connection conn =
java.sql.DriverManager.getConnection("jdbc:qed:mynewdb", p);
another way to create a new database:
Class.forName("com.quadcap.jdbc.JdbcDriver");
java.sql.Connection conn =
java.sql.DriverManager.getConnection("jdbc:qed:mynewdb;create=true");
Connection properties
| Name | Default | Description |
| mode |
from Config: qed.defaultMode |
Read-only or read-write access to the database:
rw: Read-write (the default)
r:
Read-only, intended for deployment on read-only media
|
| isCaseSensitive |
from Config: qed.isCaseSensitive |
A boolean flag specifying whether string (VARCHAR)
items are case-sensitive.
|
| create |
false |
If "true", the database will be
created if it doesn't already exist.
|
| blockSize |
from Config:
qed.blockSize |
The database block size, which must be a
power of two.
Maximum size: 32768, Minimum recommended size: 4096.
|
| cacheSize |
from Config:
qed.cacheSize |
The database cache size. A certain number of blocks
are cached in an LRU cache; this parameter controls the size
of this cache. |
| scratchCacheSize |
from Config:
qed.scratchCacheSize |
The database scratch cache size, used for caching access
to the database temporary (or "scratch") file.
A certain number of blocks are cached in an LRU cache;
this parameter controls the size
of this cache. |
| useLockFile |
true |
A boolean. If true, a lock file will
be used to protect from database corruption which might
result from multiple processes accessing the same database
simultaneously. |
| force |
false |
QED uses a lockfile to prevent multiple
JVMs from accessing the database files
simultaneously, which could lead to database
corruption. When the JVM exits normally, this
lockfile is deleted. If the JVM exits abnormally
the lockfile may be left behind, and subsequent
attempts to access the database will fail because
of the old lockfile. If force is
true, the database connection will
succeed even if the lockfile is present.
This option should be used with extreme care,
since if there really is another process accessing
the database, corruption may result. |
| logger |
2 |
If no transaction logging is required
(i.e., no rollback or recovery), then use '0'.
Otherwise, use '2', (the default) |
| scratchDir |
database directory |
Specify the directory where scratch files are written. This
option is most useful in conjunction with mode=r,
where the database files can be located on read-only media,
and the scratch directory can be located separately.
|
| fastSync |
true |
If true, omit time-consuming
sync operations to permit greater throughput.
|
| minSyncInterval |
from config:qed.minSyncInterval |
Minimum interval between syncs. Set to zero to
ensure a sync after every transaction. |
| maxSyncInterval |
from config:qed.maxSyncInterval |
Maximum interval between syncs. Use this to help
limit the logfile, scratch, and before-images files.
|
| minLogSize |
128 K bytes |
For loggers which support rollback, the minimum size
at which a logging checkpoint operation will occur. The logging
checkpoint cleans the log of entries which correspond to completed
transactions. |
| qed.deletedRowCache |
1000 |
Maximum number of deleted rows before we go to disk.
|
| encrypt |
false |
On database creation (i.e., datafile doesn't yet
exist and 'create=true'), if
encrypt == true, then a low-level
block encryption algorithm (the default is Rijndael)
will be selected, seeded with the values of the
user and passwd properties for
this, connecting, user. The database magic code will
be changed from 04 0c to 0e 0c
to signal the encrypted content.
|
Config
Overall server configuration information is also managed through
the Config "service", which can be controlled using the
file config.props. The following configuration variables
are available in that file:
| Name | Default | Description |
| debug.file |
stderr |
Specify the name of the debug log file. The default
value, stderr, causes debug information to
be written to System.err. |
| debug.level |
0 |
Specify the level of debugging output. Level zero,
the lowest level, essentially turns debug output off (though
errors/exceptions will still be reported.) Increasing levels
will result in more debug output. The maximum debug level is
five (5). |
| qed.blockSize |
8192 |
The database block size. |
| qed.cacheSize |
256 blocks |
The database cache size. A certain number of blocks
are cached in an LRU cache; this parameter controls the size
of this cache. 32 is probably a the minimum for a single-threaded
application; more threads increase cache requirements.
|
| qed.scratchCacheSize |
32 blocks |
The "temporary" file cache size. A certain number of
blocks in the scratch file (used for temporary tables and
indexes) are cached in an LRU cache; this parameter controls
the size of this cache.
|
| qed.defaultMode |
rw |
The default database 'mode'. Either 'r' or 'rw'
|
| qed.isCaseSensitive |
true |
A boolean flag specifying whether string (VARCHAR)
items are case-sensitive.
|
| qed.minSyncInterval |
15000 (15 seconds) |
The minimum interval (in ms) between database
sync operations. |
| qed.maxSyncInterval |
60000 (60 seconds) |
The maximum interval (in ms) between database
sync operations. Use this to help
limit the logfile, scratch, and before-images files.
|
Debug builds of the server can generally support various levels
of trace output configured by the Config system:
| Name | Default | Description |
| qed.trace.JdbcDriver |
0 |
bit 0: API execution
bit 1: Database open/close
|
| qed.trace.Connection |
0 |
bit 0: Connection lifecycle
bit 1: Connection methods
|
| qed.trace.Statement |
0 |
bit 0: statement lifecycle
bit 1: statement execution
bit 2: statement + result
bit 4: params in prepared statements
|
| qed.trace.ResultSet |
0 |
bit 0: API execution
bit 1: findColumn
bit 2: rows
bit 3: row + column -> object
bit 5: updateObject(row, col, obj)
|
| qed.trace.file |
0 |
bit 0: Block
bit 1: Block verbose
bit 2: BlockAccess
bit 3: BlockAccess verbose
bit 4: BlockFile
bit 5: BlockFile verbose
bit 9: Log: blobs
bit 11: SubPageManager
bit 12: Datafile
bit 13: CircularBuffer
bit 14: CircularBuffer streams
bit 15: LogSync handle POST
bit 16: Log1.reallyAddEntry()
bit 17: LogSync handle PRE
bit 18: Log restart
bit 19: Log1 save/restore blocks
bit 20: BlockStore.write
bit 21: Log1.reallyCheckpoint
bit 22: Logger1 checkpoint
bit 23: Logger1 checkpoint verbose
bit 24: sync()
bit 25: checkpoint
|
| qed.trace.index |
0 |
bit 0: Btree
bit 2: Bnode
bit 3: BnodeEnumeration
bit 4: Comparator
bit 5: Bnode splits
bit 6: BtreeCursor seek
bit 7: BtreeCursor insert
bit 8: BtreeCursor replace
bit 9: BtreeCursor delete
bit 10: BtreeCursor movement normal
bit 11: BtreeCursor movement exceptional
bit 12: BtreeCursor show keys only
|
| qed.trace.locks |
0 |
bit 0: verbose lock tracing
bit 1: lock tracing
bit 2: really verbose lock tracing
bit 3: transactions
bit 4: dump locktable every 30 seconds
|
| qed.trace |
0 |
bit 0: add/remove relation, dependencies
bit 1: Verbose exceptions
bit 2: Verbose cursor creation
bit 3: ~Verbose data
bit 4: predicate evaluation
bit 5: index constraints on queries
bit 6: key comparison
bit 7: row value operations
bit 8: Connection open/close
bit 9: end/commit/rollback trans/stmt/rs
bit 10: session: statement contexts
bit 11: table row operations
bit 12: table row stream operations
bit 13: TableOp.getRow
bit 14: TableOp.insertRow, updateRow
bit 15: LogStep.redo
bit 16: LogStep.undo
bit 17: Join temporary index creation
|
Security
QED is an embedded database using the local filesystem rather than
the network. At this level, for most embedded applications,
filesystem security is enough. This is why QED doesn't normally
use the JDBC password, and doesn't use the authentication id for
anything other than the initial SQL schema name.
With the release of QED 3.1, you can now encrypt the database
using a key formed by concatenating the JDBC user and password.
Your data is then securely protected with encryption using the new
AES (Rijndael)
standard. The encryption algorithm for 3.1 uses a 128 bit key.
Performance testing indicates a loss of about 10% on the
Fill test and a similar amount on the 'TpcB' test when
encryption is enabled.
Memory Footprint
QED is architected for low memory consumption. This means
that generally, we use streams instead of arrays wherever possible,
and that we use disk-based structures (taking advantage of the block
cache somewhat) instead of in-memory structures for things like
temporary indexes and orderings.
Java doesn't make it very easy to say exactly how much memory your
application uses, because the
numbers you get from inside the VM don't match the numbers when you
examine the external process behavior of the VM itself. That said,
here's what I can report about QED's memory footprint under at least
two VMs:
Methodology: I simply ran each of the java programs
while watching the dynamic system monitor and looking at the VM size
of the process running the VM. I looked for the peak size, which
was generally pretty much the same as the size 15 seconds later
(during the final sleep which I put in each testcase)
I used this class to measure the base footprint of the VM itself:
public class foo {
public static void main(String[] args) {
try {
Runtime r = Runtime.getRuntime();
System.out.println("footprint: " +
(r.totalMemory() - r.freeMemory()) +
" / " + r.totalMemory());
Thread.sleep(30000);
} catch (Throwable t) {}
}
}
Firing up the VM itself and a minimal set of classes has a
base overhead:
Base VM: (as measured by foo.java)
Linux JDK 1.4.1: external: 9288K, internal: 261K
Windows 2000 JDK 1.4.1 external: 5732K, internal: 251K
XmlLoad getConnection (create Database)
Linux JDK 1.4.1 external: 14060K, internal: 677k
Windows 2000 JDK 1.4.1 external: 8140K, internal: 585K
XmlLoad loadDatabase complete
Linux JDK 1.4.1 external: 15060K, internal: 1211K
Windows 2000 JDK 1.4.1 external: 8864K, internal: 698K
QED footprint (XmlLoad max footprint - Base VM footprint)
Linux Sun JDK1.4.1 external: 5772K, internal: 950K
Windows 2000 JDK 1.4.1 external: 3132K, internal: 447K
Performance
QED is designed to perform very well with most typical SQL
operations, given a few considerations:
Indexes:
Creating the proper indexes makes all
the difference on some queries. QED will use indexes that
have columns which match those used in the query/update. QED
will create (and delete) temporary indexes as necessary for
inner join operations where the requisite index doesn't exist
in the inner table.
Non-indexable queries
Some queries, like those
using 'LIKE' or using functions, cross joins, etc., aren't
indexable and use essentially brute-force approaches. This
may be OK if your data is limited, but will fail badly with
large datasets.
Joins
QED has a very simple join plan. Tables are joined left
to right, with the left table being the outer, the right
table being the inner table, in a series of nested inner
loop INNER JOINs wherever possible. Any kind of equijoin
or join on columns will use this approach. Failing a
common column, we'll resort to a cross join, which is a
full cartesian product. The inner table in the cross join
is iterated for every row of the outer, leading to
possibly very long run times.
For INNER joins, performance should be very good if the
inner column is indexed already. (I.e., consider adding
an index in this situation.)
Concurrency
QED fully supports concurrent access, while maintaining
SERIALIZABLE isolation and ACID properties. QED's
LockManager supports a hierarchical lock tree which uses
multiple lock modes to permit multiple readers and a
single writer to each database structure. Locking is
performed at the table level. Table locking implies that
sometimes programs will block, waiting for a table lock,
if it's in use by other transactions in an "inconsistent"
mode. Table locking is also (as with any two-phase
locking approach) subject to deadlock. QED inelegantly
resolves this using a configurable "lock timeout"
parameter.
It's possible (and highly recommended) to avoid deadlock
in the application, by ensuring that table locks are
acquired in a canonical order, thus preventing the
deadlock condition.
Keeping transactions short is another key under a table
locking regime. A long-running transaction clogs up the
log, blocks other transactions, and should generally be
avoided.
In general, these limitations related to concurrency are
the result of a conscious design compromise: QED's target
architecture isn't designed to maximize concurrent
performance. Rather, the objective is to be small and
fast for typical (i.e., single user) access, while
permitting concurrent access for carefully designed or
small scale applications.
Tuning
The single biggest tuning knob is the 'cacheSize' connect
property. Increasing the cache size can help quite a bit
it some applications -- some experimentation may be
necessary to find the best size for your application.
Depending on your I/O system, you may benefit from a
different (the default is 8K) block size; other supported
block sizes are 4K and 16K.
Speed freaks
For single user applications where ACID doesn't really
matter (i.e., you're bulk-loading a database from some
other source, and if the load fails, you'll just re-run
it), you can speed things up by getting rid of the
transaction logger using the connect parameter
'logger=0'
Administration
QED requires zero administration. Still, sometimes you
*want* to administer your data -- with QED, the database is
simply a directory in the filesystem containing files accessed
via a JDBC url using the QED JDBC driver. If you want, you
can use file system backups (ideally when the datbase is
closed, but even a snapshot of a running database should be
recoverable using QED's restart logic.) to backup your
database.
For more direct control over the database (as well as for a
smaller backup file), QED includes the utility classes
XmlDump, XmlLoad. These classes can be run from the command
line:
# my handy java+qed alias :-)
alias qjava=java -classpath $QED/lib/qed.jar:$QED/lib/antlr.jar
# dump the db to a compressed xml file
qjava -Djdbc.url=jdbc:qed:mydb com.quadcap.sql.tools.XmlDump db.xml.gz
# reload the db from the file
rm -rf mydb
qjava '-Djdbc.url=jdbc:qed:mydb;create=true' \
com.quadcap.sql.tools.XmlLoad db.xml.gz
You can also call XmlDump/XmlLoad from your application:
import java.io.*;
import java.sql.Connection;
import com.quadcap.sql.tools.XmlDump;
void dumpDb(Connection conn, String outfile) {
FileOutputStream fos = new FileOutputStream(outfile);
OutputStreamWriter ow = new OutputStreamWriter(fos);
BufferedWriter bw = new BufferedWriter(ow);
XmlDump dump = new XmlDump(conn);
dump.dumpTables(bw);
bw.close();
}
SQL Console
Many people have asked about a SQL consoles for QED. There
are various options available here. First, a pretty nice JSP
implementation of a SQL console, with support for DatabaseMetaData,
is included with QWS. This web application will show you all of the
QED databases currently open in this JVM, and allow you to connect
the SQL console to each database. (You can also create, backup, and
load from backup using the QED "admin" webapp.)
SQuirreL-SQL
is a free, open-source SQL console that works very well with QED.
It's my personal favorite -- easy to install, configure, and use.
iSQL-Viewer is another
free, open-source SQL console with many powerful features. It
has a steeper learning curve than SQuirrel-SQL, but it may be worth
it if you need the extra features.
DbVisualizer is
a free (though not open source) tool that works well with QED.
Fragmentation
QED reclaims space used by deleted objects, but this space
is "locked up" inside the database's data file until new objects
are created. It would be possible to move the remaining objects
around to allow the file to be truncated, but this is a somewhat
scary operation, and isn't likely to be implemented soon, if
ever.
However, the same effect can be achieved pretty easily by
using XmlDump and XmlLoad to dump
the database and then reload it without the holes.
QED database migration
Generally, database formats are pretty closely tied to the
particular QED version that generated them. So, you can't
access a QED 3.0 database using QED 3.1. You can use the
XmlDump + XmlLoad trick to convert
the old database to the new format.
The tricky part of this is that there are bugs in pre-3.0
versions of XmlDump which prevent this from working in the
expected way.
The solution is to use the new "multidriver" class,
com.quadcap.jdbc.MultiDriver. This is a
special JDBC driver which uses its own classloader to load
old QED drivers. The JDBC URL used by this driver is
just like a regular QED URL, with the subprotocol of
mqed, and the additional connection property
qed, which specifies the path to the old
QED jar file.
For example, if the original database URL were:
jdbc:qed:mydb;user=joe
The MultiDriver url to access this database
would be something like:
jdbc:mqed:mydb;user=joe;qed=/home/qed-2.2/lib/qed.jar
This is admittedly a little funky, but it's a simple solution
to an infrequent yet important problem.
|