eECHO BLOG

A journey of a thousand miles starts with a single step.

Beware of Autogenerated Schemas

We’ve covered the most important data type considerations (some with serious and
others with more minor performance implications), but we haven’t yet told you about
the evils of autogenerated schemas.
Badly written schema migration programs and programs that autogenerate schemas
can cause severe performance problems. Some programs use large VARCHAR fields for
everything, or use different data types for columns that will be compared in joins. Be
sure to double-check a schema if it was created for you automatically.
Object-relational mapping (ORM) systems (and the “frameworks” that use them) are
another frequent performance nightmare. Some of these systems let you store any type
of data in any type of backend data store, which usually means they aren’t designed to
use the strengths of any of the data stores. Sometimes they store each property of each
object in a separate row, even using timestamp-based versioning, so there are multiple
versions of each property!
This design may appeal to developers, because it lets them work in an object-oriented
fashion without needing to think about how the data is stored. However, applications
that “hide complexity from developers” usually don’t scale well. We suggest you think
carefully before trading performance for developer productivity, and always test on a
realistically large dataset, so you don’t discover performance problems too late.

Comments are closed.