Is it feasible to identify outputs of an arbitrary process at run time without excessively slowing down workflows?

12 pages • Published: December 11, 2023

Abstract

In this study, we explore the feasibility of identifying file events for any process in real time without significant workflow slowdowns, to aid in generating a data provenance report for the dynamic workflow manager MEOW. Unlike in traditional workflow managers, MEOW’s output location isn’t pre-defined, and output can initiate another job. We established criteria and examined four Linux tools: strace, perf script, inotify, and fanotify. Our findings suggest that strace meets our requirements, and that integrating an strace-based tracer into MEOW is both theoretically and practically viable. The implemented tracer slows the workflow by a factor of approximately 1.3, although worst-case scenarios show the slowdown could reach a factor of 5. This research forms the base for constructing MEOW’s data provenance report.

Keyphrases: data provenance report, fanotify, inotify, meow, perf, strace, workflow manager

In: Lindsay Quarrie (editor). Proceedings of 2023 Concurrent Processes Architectures and Embedded Systems Hybrid Virtual Conference, vol 17, pages 81-92.