
TimesTen 778: Log write failed because file system is full

Here is a problem I recently ran into with TimesTen. We deleted about 150 GB of data and reloaded some new data, but when the reload finished and we tried to create an index, we got the following message:

create index KT.WC_EX_D_M7 on KT.WC_EX_D(CON_QUAL_ID);
 778: Log write failed because file system is full
 The command failed.

We then checked the tterrors.log file and found the following:

12:42:42.27 Err : : 8442: 8448/0x1454a20: LOG: logFlusherBlksWriteReserve: Error encountered while writing to log file 28423: No space left on device; writeno = 452625, startLFO = 83968, lfstate->lfo = 204800, logfilesz = 204800, endLFO = 499712, newest_lfn = 28423; prior to write operation: bytesWritten = 0, bytesToWrite = 415744
 12:43:42.03 Err : : 8442: 8448/0x1454a20: Current log reserve status (log file sz = 1024mb, low space = yes): 0:1024mb, 1:1024mb, 2:-
 12:43:42.04 Warn: : 8442: 13277/0x2333c00: TT0778: Log write failed because file system is full -- file "logmgr.c", lineno 6245, procedure "logMgrInsert"
 12:43:42.04 Err : : 8442: 13277/0x2333c00: LOG: logMgrInsert: Failed at line 6245, lrtype=sbLRHpMerge, idmap[0:0:0+0], extra_flags=4000, xact[0.0], conn_name=tt_aggr_store
 12:43:42.04 Err : : 8442: 13277/0x2333c00: LOG: logMgrInsertComposite: Failed at line 5915, lrtype=sbLRHpMerge, idmap[0:0:0+0], extra_flags=4000, xact[0.0], conn_name=tt_aggr_store
....
 12:43:42.91 Warn: : 8442: 14:39:36.76 Err : : 8442: 8448/0x1454a20: Current log reserve status (log file sz = 1024mb, low space = no): 0:1024mb, 1:1024mb, 2:1024mb
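
When tterrors.log is long, it can help to filter it down to the lines that point at a full file system. The following is a minimal sketch, not a TimesTen tool: the function name disk_full_lines is made up for this post, and it assumes the log keeps the "TT0778:" and "No space left on device" wording shown above.

```python
import re

# Matches the TimesTen error code as it appears in the daemon log, e.g. "TT0778:".
TT_CODE = re.compile(r"\bTT(\d{4}):")


def disk_full_lines(lines):
    """Return the log lines that indicate a full file system.

    A line qualifies if it carries error code TT0778 or the operating
    system message "No space left on device" (assumed wording, taken
    from the excerpt above).
    """
    hits = []
    for line in lines:
        match = TT_CODE.search(line)
        is_778 = match is not None and match.group(1) == "0778"
        if is_778 or "No space left on device" in line:
            hits.append(line.rstrip("\n"))
    return hits
```

You could feed it an open file handle, for example disk_full_lines(open("tterrors.log")), adjusting the path to wherever your daemon log lives.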

So what happened here? We know we had enough capacity on this server when we started the delete.
When we removed a large amount of data and reloaded, those changes were written to the tt_aggr_store transaction log files, where they wait to be checkpointed. Once TimesTen has finished checkpointing, these files are removed. But here the file system is full, and we need to free some space so that the checkpoint can finish. If we log into TimesTen and call ttckpthistory, we can see that a checkpoint is in progress and is 43% complete: look at the second-to-last column of the In Progress row. If you call ttckpthistory again after a few minutes, this number should have risen.

Command> call ttckpthistory;
 < 2016-04-21 14:53:54.882654, , Fuzzy , In Progress , Subdaemon , , 0, , , , , , , , , , 43, 2822478 >
 < 2016-04-21 12:12:14.816162, 2016-04-21 14:53:53.882055, Fuzzy , Completed , Subdaemon , , 1, 28422, 1058267952, 5955922, 419430400000, 5374114, 352735081424, 2962607, 244341736136, 190724055040, , 2822477 >
 < 2016-04-21 09:48:30.575903, 2016-04-21 12:12:13.815930, Fuzzy , Completed , Subdaemon , , 0, 28265, 510229032, 6157457, 419430400000, 5493332, 329113821400, 2562203, 231257523952, 154416349184, , 2822476 >
 < 2016-04-21 09:48:00.043669, 2016-04-21 09:48:11.572258, Fuzzy , Completed , Subdaemon , , 1, 28264, 811758384, 6975148, 419430400000, 6716902, 413843632296, 1275, 87546440, 111964160, , 2822475 >
 < 2016-04-21 09:47:30.430319, 2016-04-21 09:47:42.037899, Fuzzy , Completed , Subdaemon , , 0, 28264, 811756336, 6975148, 419430400000, 6716902, 413843632296, 1275, 87546440, 111964160, , 2822474 >
 < 2016-04-21 09:47:00.798261, 2016-04-21 09:47:12.427519, Fuzzy , Completed , Subdaemon , , 1, 28264, 811754288, 6975148, 419430400000, 6716902, 413843632296, 1275, 87546440, 111964160, , 2822473 >
 < 2016-04-21 09:46:30.142190, 2016-04-21 09:46:41.794758, Fuzzy , Completed , Subdaemon , , 0, 28264, 811752240, 6975148, 419430400000, 6716902, 413843632296, 1275, 87546440, 111964160, , 2822472 >
 < 2016-04-21 09:46:00.525457, 2016-04-21 09:46:12.138730, Fuzzy , Completed , Subdaemon , , 1, 28264, 811750192, 6975148, 419430400000, 6716902, 413843632296, 1275, 87546440, 111964160, , 2822471 >
 8 rows found.
 Command>
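
If you want to watch that percentage from a script rather than by eye, the row text can be split on commas. This is only a sketch based on the output above: parse_ckpt_row is a hypothetical helper, and it assumes the row layout shown here (status in the fourth column, percent complete in the second-to-last column, no commas inside any field).

```python
def parse_ckpt_row(row):
    """Pull the start time, status, and percent complete from one
    ttckpthistory output row as printed by ttIsql.

    Assumes the layout seen above: "< col, col, ... >", with the status
    in the 4th column and percent complete in the second-to-last column
    (blank for checkpoints that are not in progress).
    """
    # Drop the surrounding "<" and ">" and split into trimmed fields.
    fields = [f.strip() for f in row.strip().strip("<>").split(",")]
    pct = fields[-2]
    return {
        "start": fields[0],
        "status": fields[3],
        "percent_complete": int(pct) if pct else None,
    }
```

For the In Progress row above this gives a status of "In Progress" and a percent_complete of 43. Completed rows have a blank in that column, so percent_complete comes back as None.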

Solution
To allow this checkpoint to finish, you have to free space on the full file system: delete any unnecessary log files that may exist, or move them to another file system. In the future, when reloading, consider loading in smaller batches, and keep monitoring the transaction logs and checkpoints.
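
As a rough illustration of loading in smaller batches while keeping an eye on disk space, here is a minimal sketch. Everything in it is an assumption made for this post: the helper names, the free-space threshold, and the load_batch callable (which you would write yourself to insert and commit one batch through whatever driver you use) are all hypothetical, and nothing here calls a TimesTen API.

```python
import shutil


def free_gb(path):
    """Free space, in GiB, on the file system that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def batches(rows, size):
    """Yield successive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_in_batches(rows, size, log_dir, min_free_gb, load_batch):
    """Load `rows` batch by batch, stopping early if the file system
    holding `log_dir` has less than `min_free_gb` GiB free.

    `load_batch` is a caller-supplied callable that inserts and commits
    one batch. Returns the number of rows handed to `load_batch`.
    """
    loaded = 0
    for batch in batches(rows, size):
        if free_gb(log_dir) < min_free_gb:
            break  # leave room for the checkpoint to complete
        load_batch(batch)
        loaded += len(batch)
    return loaded
```

The idea is simply that each batch commits on its own, so a checkpoint has a chance to complete and release log files between batches, and the load stops before the log file system fills up rather than after.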