I recently stumbled upon a wikipedia page humoristically describing various types of computer software bugs (http://en.wikipedia.org/wiki/Heisenbug). One in particular caught my attention: Schrödinbug. “…a bug that manifests itself in running software after a programmer notices that the code should never have worked in the first place.” It surprised me a little as I always though a Schrödinbug is one that lays dormant for a long time but once it manifests itself nothing works again until it is fixed.

It reminded me of my favourite bug I introduced in the old days which falls under this second definition of Schrödinbug. Consider the following snippet:

That construct (of using some current time related value as a seed) is often used and in general works well (if not used for cryptography). In this instance however, the unit of the value returned by get_time is peculiar – hectoseconds. The number of 100th of a second since some arbitrary chosen epoch (OS dependent, not 1970). according to documentation (http://docs.oracle.com/cd/B12037_01/server.101/b10755/dynviews_2147.htm) the hsecs wraps to zero to prevent integer overflow roughly every 500 days, seed takes integer as in value and all would be well if not for one thing:

hsecs datatype : NUMBER

seed input datatype BINARY_INTEGER

looks good on first glance, NUMBER is an interesting ORACLE datatype which according to docs (http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i22289)can take the range of:

  • Positive numbers in the range 1 x 10-130 to 9.99…9 x 10125 with up to 38 significant digits
  • Negative numbers from -1 x 10-130 to 9.99…99 x 10125 with up to 38 significant digits

but we’re guaranteed that hsecs will wrap to zero after reaching 232 – 1

Can you already see where this is going ? BINARY_INTEGER which is seed input data type is a signed integer. So – depending on your luck and the choice of a baseline for the epoch, after up to 250 days, this code will crash with an exception that the seed cannot accept the cast of hsecs value as it causes overflow.

This happens to me in real system, with little damage fortunately and was trivial to fix. Nevertheless – lesson learned – be mindful of the whether integers you’re using are signed as this can cause very weird things to happen that will come up after half a year after deployment.