Computer science boffins believe they have found a new way to recreate missing entries in log files.
In a paper titled "Bagging recurring event imputation to repair imperfect event log with missing categorical events", Dr Sunghyun Sim and Prof. Hyerim Bae (both from Pusan National University in South Korea), along with Prof. Ling Liu from the Georgia Institute of Technology in the United States, point out that log files should faithfully record time stamps, event names and other data.
But for one reason or another, logs are sometimes imperfect or omit certain records, making it difficult to reconstruct events. Logs with missing lines can also throw off the training of AI models.
The three authors could not find a tool to recreate the missing events. So they built an algorithm that correlates data from other relevant sources to generate the missing log entries. Essentially, it works by determining which bits of information from multiple sources are needed to form a log entry, and then automating the process of generating missing entries from the available information.
“As data is collected from multiple angles in many information systems, there is a relationship between the data collected,” Dr Sim said. “From this point, our study suggested a method of restoring missing event values using the relationship between entities in the event log, which can overcome human or system errors.”
The authors applied systematic event imputation (SEI) and multiple event imputation (MEI) simultaneously in a bagging recurrent event imputation (BREI) algorithm, which uses bootstrap sampling and recurring event imputation (REI) to repair damaged event logs. The results were very promising, we are told: testing with real event logs “improved the accuracy of the restore by 10 to 30% over existing restore algorithms.”
“In addition, it could restore almost 90 percent of the data with precision, even when more than half of it was missing.”
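To give a feel for what bootstrap sampling plus imputation can look like, here is a minimal Python sketch. It is not the authors' BREI algorithm, whose details are in the paper. It assumes a toy log in which each trace is a list of event names and a missing event is marked `None`. The helper names `fit_context_model` and `bagged_impute` are hypothetical, and the "model" is just a frequency table of which event tends to sit between a given predecessor and successor.

```python
import random
from collections import Counter, defaultdict

MISSING = None  # assumption: a missing categorical event is stored as None


def fit_context_model(traces):
    """Count which event appears between each (previous, next) pair."""
    model = defaultdict(Counter)
    for trace in traces:
        padded = ["<start>"] + list(trace) + ["<end>"]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            # Learn only from fully observed triples.
            if MISSING not in (prev, cur, nxt):
                model[(prev, nxt)][cur] += 1
    return model


def bagged_impute(traces, n_models=25, seed=0):
    """Fill missing events by majority vote over bootstrap-trained models."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw traces with replacement.
        sample = [rng.choice(traces) for _ in traces]
        models.append(fit_context_model(sample))

    repaired = []
    for trace in traces:
        padded = ["<start>"] + list(trace) + ["<end>"]
        fixed = list(trace)
        for i, event in enumerate(trace):
            if event is not MISSING:
                continue
            context = (padded[i], padded[i + 2])  # neighbours of trace[i]
            votes = Counter()
            for model in models:
                candidates = model.get(context)
                if candidates:
                    votes[candidates.most_common(1)[0][0]] += 1
            if votes:  # leave the gap if no model has seen this context
                fixed[i] = votes.most_common(1)[0][0]
        repaired.append(fixed)
    return repaired


if __name__ == "__main__":
    log = [
        ["register", "check", "approve", "pay"],
        ["register", "check", "approve", "pay"],
        ["register", "check", "reject"],
        ["register", None, "approve", "pay"],  # one event lost
        ["register", "check", "approve", "pay"],
    ]
    for original, fixed in zip(log, bagged_impute(log)):
        print(original, "->", fixed)
```

Each bootstrap model sees a slightly different resampling of the log, and the vote across models is what makes this "bagging". A gap whose neighbours are also missing, or whose context never appears in a complete trace, is deliberately left unfilled rather than guessed.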
The boffins' work has been published in IEEE Transactions on Services Computing. A summary, with a graphic attempting to explain it, has also been published. The authors believe the algorithms they have developed will soon be put to use by real-world users in industry.
The computer scientists' slide illustrating the operation of their log repair algorithm. Source: Pusan National University
Hopefully this will only happen with full disclosure of which log lines have been reconstructed and which are original: imputed logs clearly have the potential to make life interesting for digital forensic practitioners. ®