[15:53:14] <pombreda> you can ask it here, or on a ticket at https://github.com/nexB/scancode-toolkit/issues or there is also a more active Gitter chat channel at https://gitter.im/aboutcode-org/discuss
[15:55:33] <jose_ifm> well, I am running scancode on a Linux machine, trying to extract license information from a pretty big folder containing the Linux kernel (portions of it)
[15:56:06] <jose_ifm> After some hours of running, it finishes the scan and starts to save the results
[15:56:16] <jose_ifm> but it fails with the following message:
[15:56:21] <jose_ifm> ./scancode: line 114: 10444 Killed $SCANCODE_ROOT_DIR/bin/scancode "$@"
[16:15:38] <pombreda> for instance it takes 20 minutes to scan a linux kernel on my laptop ;)
[16:15:52] <jose_ifm> I know, it is the stupid VM...
[16:15:55] <pombreda> I have 16GB and quad core though
[16:16:08] <jose_ifm> will try to install it and run it in another machine
[16:16:39] <jose_ifm> A different question: I am trying to generate CSV output with '|' instead of ',' as the field separator
[16:16:45] <pombreda> but using multiprocessing (the --processes/-n option) speeds it up a lot
[16:17:38] <jose_ifm> nice hint about the processes - will use it for sure!
[16:18:09] <pombreda> why use a different separator? (and BTW the options for output/format have changed in 2.9b1, and you can create plugins for various formats and create multiple formats too)
[16:18:33] <pombreda> now your issue above is that your process is getting killed by the kernel (quite likely for using too much RAM)
[16:19:44] <pombreda> with a large scan with 2.2.1, the scan results were all in RAM at the end, hence the process kill IMHO
[16:20:01] <pombreda> the latest develop is/should be less memory hungry
[16:20:07] <jose_ifm> reason for the separator: I need to filter the output file, and using ',' as the separator also splits the copyright/license information into fields. With '|' as the separator, it is ok
[16:20:45] <pombreda> jose_ifm, ok, but there is more than just commas to CSVs
[16:21:22] <pombreda> if you need to do filtering, 2.9b1 (and the develop branch) have a much better way to do this than fiddling with the output directly
[16:21:55] <pombreda> jose_ifm, for instance this https://github.com/nexB/scancode-toolkit/blob/develop/src/scancode/plugin_only_findings.py
[16:22:39] <jose_ifm> do you know which docs I should read additionally? I already tried --only-findings, but not very successfully
[16:22:49] <pombreda> that's the --only-findings code
[16:23:13] <pombreda> there are not many docs yet. They need to be written :P
[16:30:19] <pombreda> so there are really two ways to go at it IMHO
[16:31:00] <pombreda> 1. create a proper filtering plugin, though if your goal is to keep only the files that have findings, --only-findings should work for you
[16:31:42] <pombreda> 2. continue to use your Lua post-processing script, and possibly make it easier for you by writing a new output plugin that uses | separators
[16:32:15] <pombreda> with 2. the current CSV plugin uses the default CSV options in Python
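
A minimal sketch of what "the default CSV options in Python" means for the separator question above, using only the standard `csv` module. This is not ScanCode's CSV plugin code: the `write_row` helper and the sample row are hypothetical and only illustrate the behaviour. With the defaults, a field that contains a comma is wrapped in quotes, so a CSV-aware reader still sees it as one field, and passing `delimiter="|"` is the only change needed to get pipe-separated output.

```python
import csv
import io

# Hypothetical sample row (not real ScanCode output): the copyright
# field contains a comma, which is what trips up naive splitting on ','.
ROW = ["src/example.c", "gpl-2.0", "Copyright (c) 2017, Example Corp"]


def write_row(row, delimiter=","):
    """Serialize one row with Python's csv writer and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(row)
    return buffer.getvalue()


# Default options: comma delimiter, minimal quoting. The field that
# contains a comma is quoted instead of being split.
comma_text = write_row(ROW)

# Same row with '|' as the delimiter: the comma no longer needs quoting.
pipe_text = write_row(ROW, delimiter="|")

# A CSV-aware reader recovers the original three fields from either form.
from_comma = next(csv.reader(io.StringIO(comma_text)))
from_pipe = next(csv.reader(io.StringIO(pipe_text), delimiter="|"))

print(comma_text, end="")
print(pipe_text, end="")
```

A post-processing script that reads the comma-separated file with a real CSV parser, rather than splitting each line on ',', would see the copyright text as a single field. An output plugin that uses | separators would mainly help tools that cannot handle quoted fields.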